INTRODUCTION


The theory of matrices had its origin in the theory of
determinants, and the latter had its origin in the theory of
systems of equations. From Vandermonde and Laplace to Cayley,
determinants were cultivated in a purely formal manner. The early
algebraists never successfully explained what a determinant was,
and indeed they were not interested in exact definitions.

It was Cayley who seems first to have noticed that “the idea of
matrix precedes that of determinant.” More explicitly, we can say
that the relation of determinant to matrix is that of the absolute
value of a complex number to the complex number itself, and it is
no more possible to define determinant without the previous concept
of matrix or its equivalent than it is to have the feline grin
without the Cheshire cat.

In fact, the importance of the concept of determinant has been,
and currently is, vastly over-estimated. Systems of equations can
be solved as easily and neatly without determinants as with, as is
illustrated in Chapter I of this Monograph. In fact, perhaps ninety
per cent of matric theory can be developed without mentioning a
determinant. The concept is necessary in some places, however, and
is very useful in many others, so one should not push this point
too far.

In the middle of the last century matrices were approached from
several different points of view. The paper of Hamilton (1853) on
“Linear and vector functions” is considered by Wedderburn to
contain the beginnings of the theory. After developing some
properties of “linear transformations” in earlier papers, Cayley
finally wrote “A Memoir on the Theory of Matrices” in 1858 in which
a matrix is considered as a single mathematical quantity. This
paper gives Cayley considerable claim to the honor of introducing
the modern concept of matrix, although the name is due to
Sylvester (1850).

In 1867 there appeared the beautiful paper of Laguerre entitled
“Sur le calcul des systèmes linéaires” in which matrices were
treated almost in the modern manner. It attracted little attention
at the time of its publication. Frobenius, in his fundamental paper
“Ueber lineare Substitutionen und bilineare Formen” of 1878,
approached matric theory through the composition of quadratic
forms.

In fact, Hamilton, Cayley, Laguerre and Frobenius seem to have
worked without the knowledge of each other’s results. Frobenius,
however, very soon became aware of these earlier papers and
eventually adopted the term “matrix.”

One of the central problems in matric theory is that of
similarity. This problem was first solved for the complex field by
means of the elementary divisor theory of Weierstrass and for other
rings by H. J. S. Smith and Frobenius.

In the present century a number of writers have made direct
attacks upon the problem of the rational reduction of a matrix by
means of similarity transformations. S. Lattès in 1914 and
G. Kowalewski in 1916 were among the pioneers, Kowalewski stating
that his inspiration came from Sophus Lie. Since that time many
versions of the rational reduction have been published by Dickson,
Turnbull and Aitken, van der Waerden, Menge, Wedderburn, Ingraham,
and Schreier and Sperner.

The history of these rational reductions has been interesting and
not without precedent in the field of mathematical research. The
early reductions were short, requiring only a few pages. It is not
prudent to say that any of the early papers is incorrect, for
certainly a correct result was obtained in each case, but some of
them contained arguments which were convincing only to their
authors. The exposition in places was certainly too brief. Later
writers subjected these difficult passages to closer scrutiny, as
well as to the fierce fire of generalization, with the result that
an adequate treatment was found to take many pages. The book of
Schreier and Sperner, to which the present writer acknowledges
indebtedness, contains 133 pages.

A large part of the profit which has come from this mathematical
Odyssey has been the by-products. In attempting to justify certain
steps in the proof, basic theorems on vectors and matrices were
uncovered, theorems which had not previously come to notice. Of
this origin are the theorems on the polynomial factors of the rank
equation of a matrix, facts which should have been known long ago
but which for some peculiar reason escaped discovery.

The present book is an attempt to set forth the new technique in
matric theory which the writers on the rational reduction have
developed. The long proofs have been broken down into simpler
components, and these components have been proved as preliminary
theorems in as great generality as appeared possible. With the
background developed in the first five chapters, the rational
reduction of Chapter VI does not seem difficult or unnatural.

That the vector technique will have other applications in matric
theory than to the problem which brought it forth is quite certain.
The Weyr theory for a general field was easily established (§55)
once the key theorem (Corollary 57) was known. The orthogonal
reduction (Chapter VIII) surrendered without a struggle.

The author wishes to express his appreciation of the kindness of
Professors Richard Brauer, Marguerite Darkow, Mark Ingraham, and
Saunders MacLane, who have read the manuscript and offered valuable
suggestions. While no attempt has been made to credit ideas to
their discoverers, it should not be out of place to state that the
author has been greatly influenced by the work, much of it
unpublished, of his former colleague, Mark Ingraham.

Cyrus Colton MacDuffee 

Hunter College of the City of New York
September 1, 1942 
VECTORS AND MATRICES




CHAPTER I 


SYSTEMS OF LINEAR EQUATIONS 
1. Graphs. A solution of the equation

$$2x + 3y - 6 = 0$$

is a pair of numbers $(x_1, y_1)$ such that

$$2x_1 + 3y_1 - 6 = 0.$$

There are infinitely many such solutions. A solution of
the system of equations

$$\begin{aligned}
2x + 3y - 6 &= 0, \\
4x - 3y - 6 &= 0
\end{aligned} \tag{1}$$

is a pair of numbers $(x_1, y_1)$ which is a solution of both
equations. There exists just one such solution, namely
$(2, 2/3)$.

If we picture $(x, y)$ as a point on the Cartesian plane,
the infinitely many solutions of the equation

$$2x + 3y - 6 = 0$$

are the points of a straight line $l_1$, known as the graph of
the equation. The second equation

$$4x - 3y - 6 = 0$$

also has a graph $l_2$ which is a straight line. The point of
intersection of the two lines, namely $(2, 2/3)$, is the solution
of the system of the two equations.

The point $(2, 2/3)$ is evidently the point of intersection of
the line $x = 2$ with the line $y = 2/3$. Thus the problem of
solving the system of equations (1) is equivalent to the problem
of finding the vertical line and the horizontal line which pass
through the intersection point of their graphs.



All methods of solving a system of equations such as (1) are but
variations of one and the same process. Let $k_1$ and $k_2$ be any
two numbers not both 0. The equation

$$k_1(2x + 3y - 6) + k_2(4x - 3y - 6) = 0,$$

or

$$(2k_1 + 4k_2)x + (3k_1 - 3k_2)y - 6k_1 - 6k_2 = 0, \tag{2}$$

is clearly the equation of a straight line, for the coefficients
of $x$ and $y$ cannot both be 0 unless $k_1 = k_2 = 0$. This line
passes through the intersection point of the two given lines; for
if $(x_1, y_1)$ is this intersection point, it is true for all
values of $k_1$ and $k_2$ that

$$k_1(2x_1 + 3y_1 - 6) + k_2(4x_1 - 3y_1 - 6) = k_1 \cdot 0 + k_2 \cdot 0 = 0.$$

Now for various choices of $k_1$ and $k_2$, the line (2)
represents every line of the plane through $(x_1, y_1)$. This can
be proved by showing that, if $(x_2, y_2)$ is an arbitrarily chosen
point of the plane different from $(x_1, y_1)$, there is a choice
of $k_1$, $k_2$ not both zero such that (2) passes through this
point. Let $k_1$, $k_2$ be unknown, and set

$$k_1(2x_2 + 3y_2 - 6) + k_2(4x_2 - 3y_2 - 6) = 0.$$

We may choose

$$k_1 = 4x_2 - 3y_2 - 6, \qquad k_2 = -2x_2 - 3y_2 + 6.$$

Since $(x_2, y_2)$ is not on both the given lines, not both $k_1$
and $k_2$ will be 0.

As the ratio $k_1 : k_2$ varies, the line (2) turns about the
point $(x_1, y_1)$. The problem of solving the system (1) is the
problem of finding the values of $k_1$ and $k_2$ such that (2) is
first vertical, then horizontal.

For (2) to be vertical, it is necessary and sufficient that the
coefficient of $y$, namely $3k_1 - 3k_2$, shall be 0. Let
$k_1 = k_2 = 1$. Then (2) becomes

$$6x + 0y - 12 = 0,$$

whence $x_1 = 2$. For (2) to be horizontal, it is necessary and
sufficient that the coefficient of $x$, namely $2k_1 + 4k_2$,
shall be 0. Let $k_1 = 2$, $k_2 = -1$. Then

$$0x + 9y - 6 = 0,$$

whence $y_1 = 2/3$.
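
The turning of the line (2) can be followed numerically. The
sketch below is an illustration only; it assumes that a line
$ax + by + c = 0$ is stored as the triple (a, b, c), and the names
`combine`, `eq1` and `eq2` are hypothetical helpers introduced here,
not notation of the text.

```python
from fractions import Fraction

# Assumption: each line a*x + b*y + c = 0 is stored as the triple (a, b, c).
eq1 = (2, 3, -6)    # 2x + 3y - 6 = 0
eq2 = (4, -3, -6)   # 4x - 3y - 6 = 0

def combine(k1, k2, first, second):
    """Return the coefficients of k1*first + k2*second, i.e. of line (2)."""
    return tuple(k1 * p + k2 * q for p, q in zip(first, second))

# k1 = k2 = 1 makes the coefficient of y vanish: a vertical line.
vertical = combine(1, 1, eq1, eq2)
# k1 = 2, k2 = -1 makes the coefficient of x vanish: a horizontal line.
horizontal = combine(2, -1, eq1, eq2)

# Read off the intersection point from the two special lines.
x1 = Fraction(-vertical[2], vertical[0])
y1 = Fraction(-horizontal[2], horizontal[1])
print(vertical, horizontal, x1, y1)
```

Exact fractions are used so that the value 2/3 is not blurred by
rounding.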
2. Equivalence of systems. The principle illustrated in the above
example is of general applicability, but when more than three
unknowns are involved, or when the coefficient field is not real,
the geometric interpretation becomes artificial. Suppose that we
have a system of $m$ equations in $n$ unknowns,

$$\begin{aligned}
f_1 &= a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n - c_1 = 0, \\
f_2 &= a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n - c_2 = 0, \\
&\;\;\vdots \\
f_m &= a_{m1}x_1 + a_{m2}x_2 + \cdots + a_{mn}x_n - c_m = 0,
\end{aligned} \tag{3}$$

with coefficients in any field. A solution of the equation
$f_i = 0$ is a set of numbers $(x_1', x_2', \dots, x_n')$ such that

$$a_{i1}x_1' + a_{i2}x_2' + \cdots + a_{in}x_n' - c_i = 0.$$

A solution of the system (3) is a set of numbers which is a
solution of every equation of the system.

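Suppose that there is another system of equations

$$\begin{aligned}
g_1 &= b_{11}x_1 + b_{12}x_2 + \cdots + b_{1n}x_n - d_1 = 0, \\
g_2 &= b_{21}x_1 + b_{22}x_2 + \cdots + b_{2n}x_n - d_2 = 0, \\
&\;\;\vdots \\
g_k &= b_{k1}x_1 + b_{k2}x_2 + \cdots + b_{kn}x_n - d_k = 0.
\end{aligned} \tag{4}$$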

The two systems (3) and (4) are called equivalent if 
every solution of each is a solution of the other. 

The process of solving a system of equations is the process of
finding an equivalent system of simplest possible form. Thus (3)
is equivalent to

$$\begin{aligned}
x_1 &= h_{11}p_1 + h_{12}p_2 + \cdots + h_{1l}p_l + e_1, \\
x_2 &= h_{21}p_1 + h_{22}p_2 + \cdots + h_{2l}p_l + e_2, \\
&\;\;\vdots \\
x_n &= h_{n1}p_1 + h_{n2}p_2 + \cdots + h_{nl}p_l + e_n,
\end{aligned}$$

where $p_1, p_2, \dots, p_l$ are parameters which may assume
arbitrary values in the coefficient field.

Let us consider System (3). Let $k_1, k_2, \dots, k_m$ be any
numbers, and form the linear equation

$$k_1f_1 + k_2f_2 + \cdots + k_mf_m = 0.$$

Consider the system of equations

$$\begin{aligned}
f_1 &= 0, \\
&\;\;\vdots \\
f_{i-1} &= 0, \\
k_1f_1 + k_2f_2 + \cdots + k_mf_m &= 0, \\
f_{i+1} &= 0, \\
&\;\;\vdots \\
f_m &= 0.
\end{aligned} \tag{5}$$

Clearly every solution of System (3) is a solution of System (5).
Conversely, let $(x_1', x_2', \dots, x_n')$ be any solution of (5).
It is evidently a solution of every equation of (3) except possibly
$f_i = 0$. But the $i$-th equation of (5) reduces to $k_if_i = 0$,
so if $k_i \neq 0$, it is also true that $f_i = 0$, and every
solution of (5) is a solution of (3). Hence we have

Theorem 1. If in a system of equations (3) the i-th equation is
replaced by

$$k_1f_1 + k_2f_2 + \cdots + k_mf_m = 0, \qquad k_i \neq 0,$$

the new system is equivalent to the given system.

All methods of solving a system of equations, even the method by
determinants, employ the above principle.
3. Elementary operations. There are three elementary operations
which can be performed upon the equations of a system to yield an
equivalent system. We shall call these the elementary operations
of Types I, II and III. They are:

Type I. The interchange of two equations of the system.

Type II. The multiplication of an equation of the system by a
number $k \neq 0$.

Type III.* The addition to any equation of the system of $k$ times
any other equation of the system.

The proof that each of these elementary operations when applied to
a system of equations yields an equivalent system is now
immediate. That an operation of Type I leaves the common solutions
unchanged is evident. Operations of Types II and III are special
cases of the operation of Theorem 1. Furthermore, the operation of
Theorem 1 can be achieved by one operation of Type II followed by
$m - 1$ operations of Type III. That is, we first replace $f_i = 0$
by $k_if_i = 0$ where $k_i \neq 0$, then replace this by
$k_if_i + k_jf_j = 0$ for some $j \neq i$, and so on.
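
The three operations, and the way Theorem 1 is assembled from
them, may be sketched in a few lines of Python. This is an
illustration under stated assumptions, not a definitive
implementation: each equation $a_1x_1 + \cdots + a_nx_n - c = 0$ is
assumed to be stored as the list [a1, ..., an, -c], and the function
names are hypothetical.

```python
from fractions import Fraction

# Assumption: the equation a1*x1 + ... + an*xn - c = 0 is the row [a1, ..., an, -c].

def interchange(rows, i, j):
    """Type I: interchange equations i and j."""
    rows = [r[:] for r in rows]
    rows[i], rows[j] = rows[j], rows[i]
    return rows

def scale(rows, i, k):
    """Type II: multiply equation i by a number k != 0."""
    if k == 0:
        raise ValueError("Type II requires k != 0")
    rows = [r[:] for r in rows]
    rows[i] = [k * v for v in rows[i]]
    return rows

def add_multiple(rows, i, j, k):
    """Type III: add k times equation j to a different equation i."""
    if i == j:
        raise ValueError("Type III requires two different equations")
    rows = [r[:] for r in rows]
    rows[i] = [a + k * b for a, b in zip(rows[i], rows[j])]
    return rows

def replace_by_combination(rows, i, ks):
    """Theorem 1: replace equation i by sum ks[j]*f_j, where ks[i] != 0.
    One Type II operation followed by m - 1 Type III operations."""
    out = scale(rows, i, ks[i])
    for j, k in enumerate(ks):
        if j != i:
            out = add_multiple(out, i, j, k)
    return out

def satisfies(rows, point):
    """True if the point solves every equation of the system."""
    return all(sum(a * x for a, x in zip(r[:-1], point)) + r[-1] == 0
               for r in rows)

system = [[2, 3, -6], [4, -3, -6]]            # system (1) of section 1
solution = (Fraction(2), Fraction(2, 3))
new_system = replace_by_combination(system, 0, [2, -1])

# The footnote's remark: Type I from Type III (k = +1, -1) and Type II.
step = add_multiple(system, 0, 1, 1)
step = add_multiple(step, 1, 0, -1)
step = add_multiple(step, 0, 1, 1)
swapped = scale(step, 1, -1)
```

Here `new_system` has the same solution as `system`, and `swapped`
agrees with `interchange(system, 0, 1)`.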

4. Systems of homogeneous equations. Let us now restrict attention
to a system of homogeneous equations

$$\begin{aligned}
f_1 &= a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n = 0, \\
f_2 &= a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n = 0, \\
&\;\;\vdots \\
f_m &= a_{m1}x_1 + a_{m2}x_2 + \cdots + a_{mn}x_n = 0.
\end{aligned} \tag{6}$$

* These operations are not independent, for an operation of
Type I can be obtained by a succession of operations of Type III
with $k = \pm 1$ and operations of Type II.

A system of equations is called triangular if the last coefficient
of $f_{m-1}$ is 0, the last two coefficients of $f_{m-2}$ are 0,
$\dots$, the last $m - 1$ coefficients of $f_1$ are 0. If $m > n$,
this means that all the coefficients of $f_1, f_2, \dots, f_{m-n}$
are 0 or, as we shall say, that these polynomials vanish.

Theorem 2. The system (6) of homogeneous equations is equivalent
to a triangular system.

If some coefficient of $x_n$ is not 0, we can by an interchange of
equations if necessary insure that $a_{mn} \neq 0$. By adding to
the first equation $-a_{1n}/a_{mn}$ times the last equation, we can
make the new coefficient in the place of $a_{1n}$ equal to 0.
Similarly we can make every coefficient of $x_n$ except $a_{mn}$
equal to 0. If at the start every coefficient of $x_n$ was 0, no
reduction was required.

Now ignore the last equation. Unless every coefficient of
$x_{n-1}$ (above $a_{m,n-1}$) is 0, we can assume that
$a_{m-1,n-1} \neq 0$ and as before make every other coefficient of
$x_{n-1}$ equal to 0. In this way we obtain a system of equations
of triangular form equivalent to (6). If $m > n$, the first
$m - n$ equations have vanished, each coefficient having become 0.

In every triangular system the number of non-vanishing equations
is $m \le n$. By filling in with vanishing equations we may assume
that $m = n$. In this form the coefficients
$a_{11}, a_{22}, \dots, a_{nn}$ are called the diagonal
coefficients. If the system is triangular, every coefficient to
the right of the diagonal coefficients is 0.

Theorem 3. The system (6) of homogeneous equations is equivalent
to one of triangular form in which every diagonal coefficient is
either 0 or 1; and if the diagonal coefficient in any equation is
0, the equation vanishes.

Let the equivalent triangular system be

$$\begin{aligned}
a_{11}x_1 &= 0, \\
a_{21}x_1 + a_{22}x_2 &= 0, \\
a_{31}x_1 + a_{32}x_2 + a_{33}x_3 &= 0, \\
&\;\;\vdots \\
a_{n1}x_1 + a_{n2}x_2 + \cdots + a_{nn}x_n &= 0.
\end{aligned} \tag{7}$$

Suppose that the $j$th equation

$$f_j = a_{j1}x_1 + a_{j2}x_2 + \cdots + a_{jj}x_j = 0$$

is the last one which has its diagonal coefficient $a_{jj} = 0$,
but does not vanish. Let $a_{ji}$ be the last coefficient of this
equation which is not 0. If in the $i$th equation $a_{ii} = 0$, we
may interchange the $i$th and $j$th equations to obtain a new
triangular system having one more non-zero diagonal coefficient
and such that the new $j$th equation has one more zero coefficient
to the right of its last non-zero coefficient. If, on the other
hand, $a_{ii} \neq 0$, we may add $-a_{ji}/a_{ii}$ times the $i$th
equation to the $j$th equation and thus secure a new $j$th equation
which either vanishes or has one more zero coefficient to the
right of its last non-zero coefficient. In a finite number of
steps we obtain a system in which every equation either vanishes
or has a non-zero diagonal coefficient.

Now the non-zero coefficients of the diagonal can be made 1's by
elementary operations of Type II.

Theorem 4. Every system of homogeneous equations is equivalent to
a system of the form

$$\begin{aligned}
x_1 &= k_{11}p_1 + k_{12}p_2 + \cdots + k_{1,n-r}\,p_{n-r}, \\
x_2 &= k_{21}p_1 + k_{22}p_2 + \cdots + k_{2,n-r}\,p_{n-r}, \\
&\;\;\vdots \\
x_n &= k_{n1}p_1 + k_{n2}p_2 + \cdots + k_{n,n-r}\,p_{n-r},
\end{aligned} \tag{8}$$

where $p_1, p_2, \dots, p_{n-r}$ are parameters denoting arbitrary
numbers of the coefficient field.

This form (8) is known as the solved form of the system of
equations, or the general solution of (6). The process of reducing
the system (6) to the system (8) is known as solving (6).

Let us start with a system of the form described in Theorem 3. A
certain number, say $n - r$, of the coefficients in the sequence
$a_{11}, a_{22}, \dots, a_{nn}$ will be 0. If $a_{ii}$ is the first
of these which is 0, give to $x_i$ the arbitrary value $p_1$. If
$a_{jj}$ is the second which is 0, give to $x_j$ the arbitrary
value $p_2$, etc. Every other $x$ is equal to a linear combination
of $x$'s with lower subscripts; so by eliminating these we reach
the solved form of the system.

In particular, when $n = r$, (8) reduces to the system

$$x_1 = 0, \quad x_2 = 0, \quad \dots, \quad x_n = 0.$$

Corollary 4. A system of $m < n$ homogeneous equations in $n$
unknowns always has a solution not composed entirely of 0's.
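
The passage from (6) to the solved form (8) can be sketched in
Python. It should be noted that the sketch below does not follow
the right-to-left triangular reduction of the proofs above; it
substitutes the familiar left-to-right Gauss-Jordan ordering, which
uses the same three elementary operations and reaches an equivalent
solved form. The function name `null_space_basis` is hypothetical.
Each returned vector is one column $(k_{1j}, \dots, k_{nj})$ of (8).

```python
from fractions import Fraction

def null_space_basis(rows):
    """Return vectors whose linear combinations are all solutions of
    the homogeneous system with coefficient rows `rows`.
    Sketch using Gauss-Jordan elimination over exact fractions."""
    a = [[Fraction(v) for v in row] for row in rows]
    m, n = len(a), len(a[0])
    pivots = []          # pivot column of each reduced row, in order
    r = 0
    for c in range(n):
        p = next((i for i in range(r, m) if a[i][c] != 0), None)
        if p is None:
            continue                       # x_c stays a parameter
        a[r], a[p] = a[p], a[r]            # Type I
        inv = 1 / a[r][c]
        a[r] = [v * inv for v in a[r]]     # Type II
        for i in range(m):
            if i != r and a[i][c] != 0:    # Type III
                f = a[i][c]
                a[i] = [vi - f * vr for vi, vr in zip(a[i], a[r])]
        pivots.append(c)
        r += 1
        if r == m:
            break
    free = [c for c in range(n) if c not in pivots]
    basis = []
    for fc in free:
        x = [Fraction(0)] * n
        x[fc] = Fraction(1)                # the parameter set to 1
        for row_index, pc in enumerate(pivots):
            x[pc] = -a[row_index][fc]      # pivot unknown in terms of it
        basis.append(x)
    return basis

# Two equations in three unknowns (m < n), as in Corollary 4.
basis = null_space_basis([[1, 1, 1], [1, 2, 3]])
```

With $m < n$ at least one unknown remains a parameter, so the
returned list is never empty in that case.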

5. Systems of non-homogeneous equations. Let us denote by

$$\begin{aligned}
f_1 &= a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n - c_1 = 0, \\
f_2 &= a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n - c_2 = 0, \\
&\;\;\vdots \\
f_m &= a_{m1}x_1 + a_{m2}x_2 + \cdots + a_{mn}x_n - c_m = 0
\end{aligned} \tag{9}$$

a system of $m$ non-homogeneous equations in $n$ unknowns. The
equations (6), obtained from (9) by replacing each $c$ by 0, will
be called the auxiliary homogeneous system.

Theorem 5. If equations (9) have a solution, the general solution
can be expressed as the sum of the general solution of the
auxiliary system (6) and one particular solution of (9).

We may write (9) in the abbreviated notation

$$\sum_{j=1}^{n} a_{ij}x_j = c_i \qquad (i = 1, 2, \dots, m).$$

Let $(x_1', x_2', \dots, x_n')$ and $(x_1'', x_2'', \dots, x_n'')$
be any two solutions of (9). Set $z_j = x_j' - x_j''$. Then

$$\sum_j a_{ij}z_j = \sum_j a_{ij}(x_j' - x_j'')
= \sum_j a_{ij}x_j' - \sum_j a_{ij}x_j'' = c_i - c_i = 0,$$

so that $(z_1, z_2, \dots, z_n)$ is a solution of (6). Then
$x_j' = z_j + x_j''$. Now if $(x_1'', x_2'', \dots, x_n'')$ is a
particular solution of (9), we see that every solution is the sum
of this particular solution and a solution of (6). Conversely
every such sum is a solution of (9).
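
Theorem 5 is easily checked on a small example. The system, the
particular solution and the homogeneous solution below are chosen
here purely for illustration and do not come from the text; the
helper names `apply` and `general` are hypothetical.

```python
from fractions import Fraction

# Illustrative system:  x + y + z = 6,  x + 2y + 3z = 14.
A = [[1, 1, 1], [1, 2, 3]]
c = [6, 14]

def apply(matrix, x):
    """Left-hand sides sum_j a_ij * x_j of every equation."""
    return [sum(a * v for a, v in zip(row, x)) for row in matrix]

particular = [-2, 8, 0]     # one solution of the given system
homogeneous = [1, -2, 1]    # a solution of the auxiliary system

def general(p):
    """Particular solution plus p times the homogeneous solution."""
    return [x0 + p * h for x0, h in zip(particular, homogeneous)]

solutions = [general(p) for p in (Fraction(-3, 2), 0, 1, 5)]

# The difference of two solutions solves the auxiliary system.
difference = [u - v for u, v in zip(solutions[0], solutions[3])]
```

Every member of `solutions` satisfies the original equations, and
`difference` satisfies the homogeneous ones, as the proof requires.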

Corollary 5. If $n$ linear homogeneous equations in $n$ unknowns
have only the trivial solution composed entirely of 0's, then any
non-homogeneous system having the given system as auxiliary system
has one and only one solution.

To solve a non-homogeneous system, we may proceed almost in the
same manner as in the solution of a homogeneous system. A certain
sequence of elementary operations will reduce the system (6) to
the form of Theorem 3. This same sequence of elementary operations
applied to the equations of the system (9) will reduce it to the
form

$$\begin{aligned}
a_{11}x_1 - c_1 &= 0, \\
a_{21}x_1 + a_{22}x_2 - c_2 &= 0, \\
&\;\;\vdots \\
a_{n1}x_1 + a_{n2}x_2 + \cdots + a_{nn}x_n - c_n &= 0, \\
-c_{n+1} &= 0, \\
&\;\;\vdots \\
-c_m &= 0,
\end{aligned} \tag{10}$$

where either $a_{ii} \neq 0$ or else every $a_{ij} = 0$. The values
of the $a_{ij}$ and $c_i$ in (10) are of course usually not the
same as in (9).

If $a_{ii} = 0$ in (10), or if $i > n$, the $i$th equation reduces
to $c_i = 0$, so if $c_i \neq 0$, the system (10) has no solution.
If, on the other hand, $c_i = 0$ whenever $a_{ii} = 0$ or $i > n$,
we may set $x_i = 0$ wherever $a_{ii} = 0$ and solve the remaining
equations step by step, thus obtaining a particular solution
$(l_1, l_2, \dots, l_n)$. We have proved

Theorem 6. A necessary and sufficient condition in order that the
system of non-homogeneous equations in the diagonal form (10) have
a solution is that every $c_i$ shall be 0 if the corresponding
coefficient $a_{ii}$ is 0 or if $i > n$.

Corollary 6. A system of non-homogeneous equations (9) has a
solution if and only if $k_1c_1 + k_2c_2 + \cdots + k_mc_m = 0$ for
every set of numbers $k_1, k_2, \dots, k_m$ such that

$$k_1l_1 + k_2l_2 + \cdots + k_ml_m = 0, \qquad
l_i = a_{i1}x_1 + a_{i2}x_2 + \cdots + a_{in}x_n.$$
We shall state one more result, namely 
Theorem 7. If the conditions of Theorem 6 are satisfied, (9) is
equivalent to the system

$$\begin{aligned}
x_1 &= k_{11}p_1 + k_{12}p_2 + \cdots + k_{1,n-r}\,p_{n-r} + l_1, \\
x_2 &= k_{21}p_1 + k_{22}p_2 + \cdots + k_{2,n-r}\,p_{n-r} + l_2, \\
&\;\;\vdots \\
x_n &= k_{n1}p_1 + k_{n2}p_2 + \cdots + k_{n,n-r}\,p_{n-r} + l_n,
\end{aligned}$$

where $p_1, p_2, \dots, p_{n-r}$ are parameters representing
arbitrary numbers of the coefficient field, and
$(l_1, l_2, \dots, l_n)$ is a particular solution.

This theorem is a direct consequence of Theorem 5.
VECTOR SPACES

6. Vectors in ordinary space. We shall assume that the reader is
familiar with the use of vectors in ordinary Euclidean space to
represent physical quantities, such as forces, velocities, or
accelerations, which have both magnitude and direction. Let there
be a set of three coordinate axes not all in the same plane which
we shall call the $x$-, $y$-, and $z$-axes. Let
$\epsilon_1, \epsilon_2, \epsilon_3$ be three line segments (basic
vectors), each of length $\neq 0$, emanating from the origin,
$\epsilon_1$ on the $x$-axis, $\epsilon_2$ on the $y$-axis, and
$\epsilon_3$ on the $z$-axis. Addition and scalar multiplication of
line segments which lie on the same line are defined in the
ordinary way.

Let $\phi$ be any line segment, or vector, emanating from the
origin. A parallelepiped can be constructed having $\phi$ as a
diagonal and having three of its edges on the respective
coordinate axes. The edge which lies on the $x$-axis will be
denoted by $a_1\epsilon_1$, the edge which lies on the $y$-axis by
$a_2\epsilon_2$, and the edge on the $z$-axis by $a_3\epsilon_3$.
Clearly $a_1$ will be positive if $a_1\epsilon_1$ extends in the
same direction as $\epsilon_1$, and $a_1$ will be negative if
$a_1\epsilon_1$ extends in a direction opposite to that of
$\epsilon_1$. Relative to the basic vectors
$\epsilon_1, \epsilon_2, \epsilon_3$, the vector $\phi$ has the
components or coordinates $a_1, a_2, a_3$, and we write

$$\phi = (a_1, a_2, a_3).$$

Conversely every triple $(a_1, a_2, a_3)$ of three real numbers
determines a unique vector $\phi$. The zero vector $(0, 0, 0)$ is
of length 0 so that its direction is quite immaterial.


The reader will recall that the resultant or sum of two vectors

$$\phi_1 = (a_1, a_2, a_3), \qquad \phi_2 = (b_1, b_2, b_3)$$

is the diagonal of the parallelogram having the summands as
adjacent sides, and that in terms of the coordinates of the
vectors this is equivalent to the identity

$$\phi_1 + \phi_2 = (a_1 + b_1,\; a_2 + b_2,\; a_3 + b_3).$$

The reader will also recall that

$$k\phi = (ka_1, ka_2, ka_3)$$

is a vector lying on the same line as $\phi$ which is obtained by
multiplying the segment $\phi$ by the number $k$. This vector
$k\phi$ is called the scalar product of the vector $\phi$ by the
scalar $k$. The vector $k\phi$ extends in the same or the opposite
direction as $\phi$ according as $k$ is positive or negative.

If $\phi_1$ and $\phi_2$ are two vectors which do not lie* in the
same straight line, they determine a plane. Every vector in the
plane (having, of course, one end at the origin) is of the form

$$k_1\phi_1 + k_2\phi_2,$$

and conversely every such vector is in the plane of $\phi_1$ and
$\phi_2$.

Furthermore, if $\phi_1, \phi_2, \phi_3$ are three vectors which do
not lie in a plane, the set of all vectors

$$k_1\phi_1 + k_2\phi_2 + k_3\phi_3$$

constitute all the vectors in space. We say that
$\phi_1, \phi_2, \phi_3$ span the vector space.

* In deference to the usual notion that two vectors are equal if
they have the same length and direction, we should word our
assumption concerning $\phi_1$ and $\phi_2$ by saying that they are
two vectors which cannot be laid on the same line. Similarly
$\phi_1, \phi_2, \phi_3$ are three vectors which cannot be laid on
the same plane.

Three vectors span the space unless one of them lies in the plane
containing the other two. That condition is equivalent to the
statement that there exist three numbers $k_1, k_2, k_3$ not all
zero such that

$$k_1\phi_1 + k_2\phi_2 + k_3\phi_3 = 0.$$

For if $k_1 \neq 0$, then $\phi_1$ lies in a plane containing
$\phi_2$ and $\phi_3$; and conversely. If $k_2 \neq 0$, then
$\phi_2$ lies in a plane containing $\phi_1$ and $\phi_3$, and so
forth.

If $\epsilon_1, \epsilon_2, \epsilon_3$ are mutually orthogonal
unit vectors, the coordinates of $\phi$ are the cosines of the
angles which $\phi$ makes with the respective $\epsilon$'s each
multiplied by the length of the vector. If each of the vectors

$$\phi_1 = (a_1, a_2, a_3), \qquad \phi_2 = (b_1, b_2, b_3)$$

is of unit length, the cosine of the angle between them is

$$\cos\theta = a_1b_1 + a_2b_2 + a_3b_3.$$

This expression is called the inner product or scalar product of
the two vectors. If this inner product is 0, then
$\cos\theta = 0$, and the vectors are orthogonal.
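
The formula for $\cos\theta$ may be tried out numerically. In the
sketch below, which is illustrative only, vectors are assumed to be
given by their coordinates relative to mutually orthogonal unit
basic vectors, and the names `inner`, `unit` and `cos_angle` are
hypothetical.

```python
import math

def inner(u, v):
    """Inner product a1*b1 + a2*b2 + a3*b3 of two coordinate triples."""
    return sum(a * b for a, b in zip(u, v))

def unit(u):
    """Scale a non-zero vector to unit length."""
    length = math.sqrt(inner(u, u))
    return tuple(a / length for a in u)

def cos_angle(u, v):
    """Cosine of the angle between two non-zero vectors: the inner
    product of the corresponding vectors of unit length."""
    return inner(unit(u), unit(v))

orthogonal = inner((1, 0, 0), (0, 1, 0))          # vanishes
forty_five = cos_angle((1, 1, 0), (1, 0, 0))      # close to sqrt(2)/2
```

Floating-point arithmetic is used here because of the square root,
so comparisons should be made with a tolerance.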

7. Vectors in general.* This geometric concept is easily
generalized. We define an ordered set

$$\phi = (a_1, a_2, \dots, a_n)$$

of $n$ numbers of a field $F$ to be a vector of order or dimension
$n$, and call $a_1, a_2, \dots, a_n$ its components. Two such
vectors are equal if and only if corresponding components are
equal.

* A reader with a liking for the abstract may read Chapter IX at
this point.

We define the operation of scalar multiplication, where $k$ is in
$F$, by the identity

$$k\phi = (ka_1, ka_2, \dots, ka_n).$$

If $\phi_1 = (b_1, b_2, \dots, b_n)$ is another vector, we define
the sum by the formula

$$\phi + \phi_1 = (a_1 + b_1,\; a_2 + b_2,\; \dots,\; a_n + b_n)$$

and the inner product by

$$\phi \cdot \phi_1 = a_1b_1 + a_2b_2 + \cdots + a_nb_n.$$

A set of vectors which is closed under the operations of scalar
multiplication and addition is called a linear system of vectors,
or a vector space. Thus if $\phi_1, \phi_2, \dots, \phi_m$ are $m$
vectors of order $n$, the set of all linear combinations

$$k_1\phi_1 + k_2\phi_2 + \cdots + k_m\phi_m$$

of these vectors, where $k_1, k_2, \dots, k_m$ vary over $F$, is
such a linear system $S$. The vectors
$\phi_1, \phi_2, \dots, \phi_m$ span $S$.

The same space may be spanned by many different 
sets of vectors, even by sets which contain different 
numbers of vectors. Thus the two vectors (1, 0) and 
(0, 1) span the same space that is spanned by the three 
vectors 

$$(5, 2), \qquad (-1, 1/2), \qquad (1/3, 1).$$

Two sets of vectors which span the same space are
called linearly equivalent.
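
That the two sets just mentioned are linearly equivalent can be
confirmed by exhibiting each vector of one set as a linear
combination of the vectors of the other. In the following sketch
the coefficients 1/9, -4/9, 2/9 and 10/9 were worked out by hand for
this illustration and are not given in the text; the helper
`combination` is hypothetical.

```python
from fractions import Fraction as Fr

def combination(coeffs, vectors):
    """Return the linear combination sum_i coeffs[i] * vectors[i]."""
    n = len(vectors[0])
    return tuple(sum(k * v[i] for k, v in zip(coeffs, vectors))
                 for i in range(n))

unit_vectors = [(Fr(1), Fr(0)), (Fr(0), Fr(1))]
three = [(Fr(5), Fr(2)), (Fr(-1), Fr(1, 2)), (Fr(1, 3), Fr(1))]

# Each of the three vectors is a combination of (1, 0) and (0, 1),
# with its own components as coefficients.
from_units = [combination(v, unit_vectors) for v in three]

# Conversely (1, 0) and (0, 1) are combinations of the three vectors;
# the third vector is not even needed, so its coefficient is 0.
e1 = combination((Fr(1, 9), Fr(-4, 9), 0), three)
e2 = combination((Fr(2, 9), Fr(10, 9), 0), three)
```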

The vectors $\phi_1, \phi_2, \dots, \phi_m$ are linearly dependent
(with respect to $F$) if $m$ numbers $k_1, k_2, \dots, k_m$ of $F$
exist, not all zero, such that

$$k_1\phi_1 + k_2\phi_2 + \cdots + k_m\phi_m = 0.$$

If on the other hand every such relation implies that
$k_1 = k_2 = \cdots = k_m = 0$, the vectors are linearly
independent. We recall that in three-space three vectors are
linearly dependent if and only if they are coplanar.

A set of vectors $\phi_1, \phi_2, \dots, \phi_r$ which span $S$ and
are linearly independent constitute a basis for $S$.

Theorem 8. Every linear system of nth order vectors over $F$ has a
basis

$$\phi_1, \phi_2, \dots, \phi_r, \qquad r \le n.$$

If there exist in $S$ vectors all of whose components except the
first are equal to 0, let

$$(a_{11}, 0, 0, \dots, 0), \qquad a_{11} \neq 0$$

be one such vector, and call it $\phi_1$. If there exist in $S$
vectors of the form

$$(a_{21}, a_{22}, 0, \dots, 0), \qquad a_{22} \neq 0,$$

choose one and call it $\phi_2$, and so on. If there is a vector of
the form

$$(a_{n1}, a_{n2}, \dots, a_{nn}), \qquad a_{nn} \neq 0,$$

choose one and call it $\phi_n$. Some of these types may not exist.
Those which do exist give us a basis which, by a relabeling of the
subscripts, we can call $\phi_1, \phi_2, \dots, \phi_r$.

First we must show that these $r$ vectors are linearly
independent. It will be simpler notationally to assume a relation

$$k_1\phi_1 + k_2\phi_2 + \cdots + k_n\phi_n = 0$$

among the $\phi$'s as originally defined with the understanding
that $k_i = 0$ if $\phi_i$ does not exist. Such a relation implies

$$\begin{aligned}
k_1a_{11} + k_2a_{21} + \cdots + k_{n-1}a_{n-1,1} + k_na_{n1} &= 0, \\
k_2a_{22} + \cdots + k_{n-1}a_{n-1,2} + k_na_{n2} &= 0, \\
&\;\;\vdots \\
k_{n-1}a_{n-1,n-1} + k_na_{n,n-1} &= 0, \\
k_na_{nn} &= 0.
\end{aligned}$$

If $\phi_n$ exists, $a_{nn} \neq 0$, so $k_n = 0$. Thus $k_n = 0$
whether $\phi_n$ exists or not. Then, similarly, $k_{n-1} = 0$, and
in fact every $k_i$ is zero.

Finally we must show that every vector of $S$ is a linear
combination of these vectors. Let

$$\phi = (b_{n1}, b_{n2}, \dots, b_{nn})$$

be any vector of $S$. If $b_{nn} \neq 0$, $\phi_n$ exists, and the
vector

$$\phi - \frac{b_{nn}}{a_{nn}}\,\phi_n$$

exists in $S$, and has 0 for its last component. Let

$$\phi' = (b_{n-1,1}, b_{n-1,2}, \dots, b_{n-1,n-1}, 0)$$

denote $\phi - b_{nn}\phi_n/a_{nn}$ or $\phi$ according as $\phi_n$
exists or does not exist. We proceed as before. If
$b_{n-1,n-1} \neq 0$, then

$$\phi'' = \phi' - \frac{b_{n-1,n-1}}{a_{n-1,n-1}}\,\phi_{n-1}$$

has 0's for its last two components. Finally we have

$$\phi - \frac{b_{nn}}{a_{nn}}\,\phi_n - \cdots - \frac{b_{11}}{a_{11}}\,\phi_1 = 0,$$

$$\phi = \frac{b_{11}}{a_{11}}\,\phi_1 + \frac{b_{22}}{a_{22}}\,\phi_2
+ \cdots + \frac{b_{nn}}{a_{nn}}\,\phi_n,$$

where some of these terms will be missing if the corresponding
$\phi_i$ do not exist.

Corollary 8. Unless, for every $i$, the linear system $S$ contains
a vector of the form $(a_1, a_2, \dots, a_i, 0, \dots, 0)$ with
$a_i \neq 0$, $S$ has a basis of fewer than $n$ vectors.

We have now shown not only that every linear system $S$ has a
basis, but that there is a basis of a particularly simple form.
Thus for $n = 3$ we have a basis

$$\begin{aligned}
\phi_1 &= (a_{11}, 0, 0), & a_{11} &\neq 0, \\
\phi_2 &= (a_{21}, a_{22}, 0), & a_{22} &\neq 0, \\
\phi_3 &= (a_{31}, a_{32}, a_{33}), & a_{33} &\neq 0
\end{aligned}$$

if all these $\phi$'s exist, but any one or any two of them might
be missing. The extreme case where all three are missing is
conceptually possible but trivial, since then $S$ consists only of
the 0-vector.
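
When $S$ is given by a finite set of spanning vectors, the
triangular basis of Theorem 8 can be produced mechanically. The
sketch below is one possible reading of the proof, stated under
that assumption, and not a procedure given in the text: working
from the last component backwards, it picks a vector with non-zero
component in that place (if any exists) and uses it to clear that
component in the remaining vectors. The name `triangular_basis` is
hypothetical.

```python
from fractions import Fraction

def triangular_basis(spanning):
    """Return {i: phi_i}, where phi_i has non-zero i-th component and
    zeros after it, for those i for which such a vector exists in the
    span. Sketch for a linear system given by finitely many vectors."""
    work = [[Fraction(x) for x in vec] for vec in spanning]
    n = len(work[0])
    basis = {}
    for i in range(n - 1, -1, -1):
        # All vectors in `work` already have zeros after position i.
        pivot = next((v for v in work if v[i] != 0), None)
        if pivot is None:
            continue                     # no vector of this type exists
        work.remove(pivot)
        basis[i + 1] = pivot             # 1-based subscript, as in the text
        work = [[x - (v[i] / pivot[i]) * p for x, p in zip(v, pivot)]
                for v in work]
    return basis

plane = triangular_basis([(5, 2), (-1, Fraction(1, 2)), (Fraction(1, 3), 1)])
line = triangular_basis([(1, 2, 0), (2, 4, 0)])
```

In the second example only $\phi_2$ exists, so the linear system
has a basis of a single vector, in agreement with Corollary 8.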

8. Rank of a linear system. The basis of a linear sys- 
tem is far from unique, but we shall prove the important 
theorem that the number r of vectors in a basis is the 
same for all bases. To this end we shall prove the simple 
but fundamental Replacement Theorem of Steinitz. 

Theorem 9. Let $\psi_1, \psi_2, \dots, \psi_s$ be linearly
independent vectors, each linearly dependent upon the vectors
$\phi_1, \phi_2, \dots, \phi_r$. Among
$\phi_1, \phi_2, \dots, \phi_r$ there exists a subset of $s$
vectors which we may notationally take to be
$\phi_1, \phi_2, \dots, \phi_s$ such that
$\phi_1, \phi_2, \dots, \phi_r$ are linearly equivalent to
$\psi_1, \psi_2, \dots, \psi_s, \phi_{s+1}, \dots, \phi_r$. Then
$s \le r$.

Let $r$ be fixed $\ge 1$ and let $s = 1$. The statement that
“$\psi_1$ is linearly independent” means that $k\psi_1 = 0$ implies
$k = 0$, that is, that $\psi_1 \neq 0$. Then

$$\psi_1 = a_1\phi_1 + a_2\phi_2 + \cdots + a_r\phi_r$$

where not every $a_i$ is zero. It is only necessary to relabel the
$\phi$'s to insure that $a_1 \neq 0$. Then

$$\phi_1 = \frac{1}{a_1}\,\psi_1 - \frac{a_2}{a_1}\,\phi_2 - \cdots
- \frac{a_r}{a_1}\,\phi_r.$$

Hence every linear combination of $\phi_1, \phi_2, \dots, \phi_r$
is a linear combination of $\psi_1, \phi_2, \dots, \phi_r$, and the
converse is evident. Hence the theorem is true for $s = 1$.

The proof may be completed by induction. Suppose the theorem to be
true for $\psi_1, \psi_2, \dots, \psi_{s-1}$. Then by the induction
hypothesis $\phi_1, \phi_2, \dots, \phi_r$ are linearly equivalent
to $\psi_1, \psi_2, \dots, \psi_{s-1}, \phi_s, \dots, \phi_r$, and
in particular $\psi_s$ can be written in the form

$$\psi_s = a_1\psi_1 + \cdots + a_{s-1}\psi_{s-1} + a_s\phi_s
+ \cdots + a_r\phi_r.$$

Not all of the $a_s, a_{s+1}, \dots, a_r$ are 0, nor can it be true
that $r \le s - 1$, for $\psi_1, \psi_2, \dots, \psi_s$ are
linearly independent. It is a matter of notation to assume that
$a_s \neq 0$. Then we can write

$$\phi_s = b_1\psi_1 + \cdots + b_s\psi_s + b_{s+1}\phi_{s+1}
+ \cdots + b_r\phi_r.$$

Hence the sets
$(\psi_1, \psi_2, \dots, \psi_{s-1}, \phi_s, \dots, \phi_r)$ and
$(\psi_1, \psi_2, \dots, \psi_s, \phi_{s+1}, \dots, \phi_r)$ are
linearly equivalent. The theorem is now proved.

Corollary 9a. //0i, 02, • • • , 0 r and 0 X , 0i, • • • , 0, 
aro /wo iasos 0/ /Ao samo /inear system , /Aon f =5. 

For in this case 0i, 0s, • • • , 0* are linearly independ- 
ent, and each is equal to a linear combination of 

01, 02, • • • , 0 r . Hence $ ^r. Similarly 0 4 , 0*, • • • , 0 r are 
linearly independent and each is a linear combination of 
0i, 02, • • • , 0# so that r£s. Hence r=$. 

It is now proper to define the rank of a linear system
$S$ of order $n$ as the number of linearly independent vectors
in a basis. By Theorem 8, $S$ has a basis composed of
$r \le n$ linearly independent vectors, and by Corollary 9a
this number $r$ is the same for all bases. That is, the rank
of $S$ is the maximum number of linearly independent vectors in
any set of vectors which spans $S$.

Corollary 9b. If $S$ is a linear system of rank $n$, then
more than $n$ vectors of $S$ are always linearly dependent.

Corollary 9c. If $S$ is a linear system of rank $r$, any
$r$ vectors of $S$ form a basis for $S$ if and only if they are
linearly independent.

Corollary 9d. If $\phi_1, \phi_2, \dots, \phi_s$ for $s < r$ are linearly
independent vectors of $S$, it is possible to find $r - s$ vectors
$\phi_{s+1}, \dots, \phi_r$ such that $\phi_1, \phi_2, \dots, \phi_r$ form a basis for $S$.
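
The rank of the system spanned by a finite set of vectors can be found mechanically. The following is a minimal sketch, not the author's procedure: it assumes rational components (held as `Fraction` values) and uses ordinary elimination; the function name `rank` and the sample vectors are illustrative choices.

```python
from fractions import Fraction

def rank(vectors):
    """Count the linearly independent vectors among `vectors` by elimination."""
    rows = [[Fraction(x) for x in v] for v in vectors]
    ncols = len(rows[0]) if rows else 0
    r = 0  # number of independent rows found so far
    for col in range(ncols):
        # find a row at or below position r with a non-zero entry in this column
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r

# Four vectors of order 3; the second is twice the first and the fourth is
# the sum of the first and third, so only two are independent.
spanning = [(1, 2, 3), (2, 4, 6), (0, 1, 1), (1, 3, 4)]
print(rank(spanning))
```

In the spirit of Corollary 9b, three vectors of order 2 such as `(1, 0), (0, 1), (1, 1)` can never give a rank above 2.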

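9. The concept of matrix. A matrix

\[ A = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{bmatrix} \]

may be thought of as an ordered set of vectors. That is,
$A$ consists of $m$ rows, each row $\alpha_i$ being a vector. If $A$ is
an $m \times n$ array, it consists of $m$ row vectors, each of order
$n$. If $m = 1$, $A$ is an ordinary vector of order $n$, so that
the concept of matrix generalizes the concept of vector.
Clearly $A$ may be written

\[ A = \begin{bmatrix} \alpha_1 \\ \alpha_2 \\ \vdots \\ \alpha_m \end{bmatrix}, \qquad \alpha_i = (a_{i1}, a_{i2}, \dots, a_{in}). \]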

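It is only natural to introduce two operations, addition
and scalar multiplication, derived from the addition
and scalar multiplication of vectors. Thus if

\[ B = \begin{bmatrix} \beta_1 \\ \beta_2 \\ \vdots \\ \beta_m \end{bmatrix}, \qquad \beta_i = (b_{i1}, b_{i2}, \dots, b_{in}), \]

it is natural to define $A + B$ as follows:

\[ A + B = \begin{bmatrix} \alpha_1 + \beta_1 \\ \alpha_2 + \beta_2 \\ \vdots \\ \alpha_m + \beta_m \end{bmatrix} = \begin{bmatrix} a_{11} + b_{11} & \cdots & a_{1n} + b_{1n} \\ a_{21} + b_{21} & \cdots & a_{2n} + b_{2n} \\ \vdots & & \vdots \\ a_{m1} + b_{m1} & \cdots & a_{mn} + b_{mn} \end{bmatrix} = (a_{rs} + b_{rs}); \tag{11} \]

and if $k$ is any scalar (i.e., any element of the field $F$), we
define the scalar product by means of the formula

\[ kA = \begin{bmatrix} k\alpha_1 \\ k\alpha_2 \\ \vdots \\ k\alpha_m \end{bmatrix} = \begin{bmatrix} ka_{11} & \cdots & ka_{1n} \\ ka_{21} & \cdots & ka_{2n} \\ \vdots & & \vdots \\ ka_{m1} & \cdots & ka_{mn} \end{bmatrix} = (ka_{rs}). \]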


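If $\beta$ is any vector, it is natural to define multiplication
to be distributive. That is,

\[ A\beta = \begin{bmatrix} \alpha_1 \\ \alpha_2 \\ \vdots \\ \alpha_m \end{bmatrix} \beta = \begin{bmatrix} \alpha_1\beta \\ \alpha_2\beta \\ \vdots \\ \alpha_m\beta \end{bmatrix}, \]

which is a vector with components in $F$ whose components
are written in a column instead of in a row. Thus
if

\[ \beta = \begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_n \end{bmatrix}, \]

then

\[ A\beta = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{bmatrix} \begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_n \end{bmatrix} = \begin{bmatrix} a_{11}b_1 + a_{12}b_2 + \cdots + a_{1n}b_n \\ a_{21}b_1 + a_{22}b_2 + \cdots + a_{2n}b_n \\ \vdots \\ a_{m1}b_1 + a_{m2}b_2 + \cdots + a_{mn}b_n \end{bmatrix}. \]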


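A matrix

\[ B = \begin{bmatrix} b_{11} & b_{12} & \cdots & b_{1l} \\ b_{21} & b_{22} & \cdots & b_{2l} \\ \vdots & \vdots & & \vdots \\ b_{n1} & b_{n2} & \cdots & b_{nl} \end{bmatrix} = (b_{rs}) \]

may alternatively be thought of as consisting of a row of
$l$ columns, each column being a vector:

\[ B = (\beta_1, \beta_2, \dots, \beta_l). \]

Then for any vector $\gamma = (c_1, \dots, c_n)$ we define $\gamma \cdot B$
as follows:

\[ \gamma B = \gamma(\beta_1, \beta_2, \dots, \beta_l) = (\gamma\beta_1, \gamma\beta_2, \dots, \gamma\beta_l) \]

\[ = (c_1, c_2, \dots, c_n) \begin{bmatrix} b_{11} & b_{12} & \cdots & b_{1l} \\ b_{21} & b_{22} & \cdots & b_{2l} \\ \vdots & \vdots & & \vdots \\ b_{n1} & b_{n2} & \cdots & b_{nl} \end{bmatrix} \]

\[ = (c_1b_{11} + c_2b_{21} + \cdots + c_nb_{n1}, \ \dots, \ c_1b_{1l} + c_2b_{2l} + \cdots + c_nb_{nl}). \]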


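Now comes the problem of defining the product of
two matrices. Clearly it would be possible to employ the
definition

\[ AB = \begin{bmatrix} \alpha_1 \\ \alpha_2 \\ \vdots \\ \alpha_m \end{bmatrix} B = \begin{bmatrix} \alpha_1 B \\ \alpha_2 B \\ \vdots \\ \alpha_m B \end{bmatrix} = \begin{bmatrix} \alpha_1\beta_1 & \alpha_1\beta_2 & \cdots & \alpha_1\beta_l \\ \alpha_2\beta_1 & \alpha_2\beta_2 & \cdots & \alpha_2\beta_l \\ \vdots & \vdots & & \vdots \\ \alpha_m\beta_1 & \alpha_m\beta_2 & \cdots & \alpha_m\beta_l \end{bmatrix}. \]

But it would be an equally sensible definition if we wrote

\[ AB = A(\beta_1, \beta_2, \dots, \beta_l) = (A\beta_1, A\beta_2, \dots, A\beta_l) = \begin{bmatrix} \alpha_1\beta_1 & \alpha_1\beta_2 & \cdots & \alpha_1\beta_l \\ \alpha_2\beta_1 & \alpha_2\beta_2 & \cdots & \alpha_2\beta_l \\ \vdots & \vdots & & \vdots \\ \alpha_m\beta_1 & \alpha_m\beta_2 & \cdots & \alpha_m\beta_l \end{bmatrix}. \]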


Fortunately both results are the same. We define, then,
the product of two matrices $A$ and $B$, the first composed
of row vectors of order $n$, the second of column vectors
of order $n$, to be the matrix whose element in row $r$ and
column $s$ is the inner product of the $r$th row vector of $A$
by the $s$th column vector of $B$. That is,

\[ AB = (a_{rs})(b_{rs}) = \Bigl( \sum_{i=1}^{n} a_{ri} b_{is} \Bigr). \tag{12} \]

Note that the product $AB$ can be formed only when
the number of columns of $A$ is equal to the number of
rows of $B$.
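
The agreement of the two candidate definitions can be checked numerically. This is a small sketch under stated assumptions: matrices are held as lists of rows, and the helper names (`inner`, `columns`, `product_by_rows`, `product_by_columns`) are hypothetical, chosen only for this illustration.

```python
def inner(u, v):
    """Inner product of two vectors of the same order."""
    return sum(a * b for a, b in zip(u, v))

def columns(B):
    """Column vectors beta_1, ..., beta_l of B."""
    return [list(col) for col in zip(*B)]

def product_by_rows(A, B):
    # Row i of the product is alpha_i * B = (alpha_i beta_1, ..., alpha_i beta_l).
    cols = columns(B)
    return [[inner(alpha, beta) for beta in cols] for alpha in A]

def product_by_columns(A, B):
    # Column j of the product is A * beta_j; the columns are then laid side by side.
    cols = [[inner(alpha, beta) for alpha in A] for beta in columns(B)]
    return [list(row) for row in zip(*cols)]

A = [[1, 2, 3], [4, 5, 6]]      # 2 x 3: row vectors of order 3
B = [[1, 0], [2, 1], [0, 3]]    # 3 x 2: column vectors of order 3
print(product_by_rows(A, B))
print(product_by_columns(A, B))
```

Both calls build the same array, whose element in row r and column s is the inner product described by (12).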

If $B$ is an $m \times l$ matrix and $O_m$ is an $m \times m$ matrix
whose elements are all 0, then

\[ O_m B = O \]

is an $m \times l$ matrix all of whose elements are 0. Similarly
if $O_l$ is an $l \times l$ matrix all of whose elements are 0, then

\[ B O_l = O. \]

If $B$ is square, $O_m = O_l = O$. Such matrices, all of whose
elements are 0, are called zero matrices.

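Let $I_n$ denote the matrix

\[ I_n = \begin{bmatrix} \epsilon_1 \\ \epsilon_2 \\ \vdots \\ \epsilon_n \end{bmatrix} = \begin{bmatrix} 1 & 0 & \cdots & 0 \\ 0 & 1 & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & 1 \end{bmatrix} = (\delta_{rs}) \]

where $\epsilon_1, \epsilon_2, \dots, \epsilon_n$ are the unit vectors of order $n$, and
$\delta_{rs}$ is Kronecker’s delta, denoting 1 for $r = s$ and 0 for
$r \ne s$. If $B$ is an $m \times l$ matrix, then

\[ I_m B = \Bigl( \sum_i \delta_{ri} b_{is} \Bigr) = (b_{rs}) = B, \]

\[ B I_l = \Bigl( \sum_i b_{ri} \delta_{is} \Bigr) = (b_{rs}) = B. \]

Thus $I_n$ is called the unit matrix of order $n$. If $B$ is
square, $I_m = I_l$ is both a left and a right unit multiplier.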





A matrix

\[ (k\delta_{rs}) = kI \]

is called a scalar matrix. Since

\[ kI + lI = (k + l)I, \qquad kI \cdot lI = (kl)I, \]

all the scalar matrices of a given order are isomorphic*
under both addition and multiplication with the numbers
of $F$.

The matrix $A^T = (a_{sr})$ is called the transpose of the
matrix $A = (a_{rs})$, the second subscript denoting the row
and the first subscript denoting the column in which an
element stands. Thus $a_{34}$, which is in row 3 and column
4 of $A$, is in row 4 and column 3 of $A^T$. The row vectors
of $A^T$ are the column vectors of $A$ and vice versa. It follows
directly from (11) and (12) that

\[ (A + B)^T = A^T + B^T, \qquad (AB)^T = B^T A^T. \]

The first of these equations is obvious, and the second
follows from the relation

\[ (AB)^T = \Bigl( \sum_i a_{si} b_{ir} \Bigr) = \Bigl( \sum_i b_{ir} a_{si} \Bigr) = B^T A^T. \]

A correspondence such as $A \leftrightarrow A^T$ which reverses the
order of multiplication but otherwise obeys the postulates
for an automorphism is called an anti-automorphism.

* A biunique correspondence

\[ a \leftrightarrow a', \quad b \leftrightarrow b', \quad \dots \]

between all the elements $a, b, \dots$ of a set $\Sigma$ and all the elements
$a', b', \dots$ of another set $\Sigma_1$ is an isomorphism relative to two
corresponding operations $\circ$ of $\Sigma$ and $\times$ of $\Sigma_1$ if

\[ a \circ b \leftrightarrow a' \times b' \]

for all pairs of elements of the two sets.

Of fundamental importance is the fact that matric
multiplication is associative. That is,

\[ (AB)C = A(BC). \tag{13} \]

This is immediately provable from (12), for each side of
(13) is equal to

\[ \Bigl( \sum_{i,j} a_{ri} b_{ij} c_{js} \Bigr). \]

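Multiplication is not usually commutative, however.
For instance,

\[ \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} 0 & 0 \\ 1 & 0 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}, \qquad \begin{bmatrix} 0 & 0 \\ 1 & 0 \end{bmatrix} \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix} = \begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix}. \]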
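
The associative law, the failure of the commutative law, and the transposition rule can all be tried on small integer matrices. The sketch below is illustrative only; the helpers `matmul` and `transpose` and the particular matrices are assumptions made for the example.

```python
def matmul(A, B):
    """Element (r, s) is the inner product of row r of A with column s of B."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    """Rows of the result are the columns of A."""
    return [list(col) for col in zip(*A)]

A = [[0, 1], [0, 0]]
B = [[0, 0], [1, 0]]
C = [[1, 2], [3, 4]]

# Associativity (13): both groupings give the same matrix.
print(matmul(matmul(A, B), C) == matmul(A, matmul(B, C)))
# Commutativity fails for this pair: AB and BA differ.
print(matmul(A, B), matmul(B, A))
# Transposition reverses the order of the factors.
print(transpose(matmul(A, B)) == matmul(transpose(B), transpose(A)))
```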

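10. Rectangular arrays. One of the interesting and
remarkable properties of matric multiplication is the
following. Let $C$ be a rectangular array of $l$ rows and $m$
columns, and let $A$ be a rectangular array of $m$ rows and
$n$ columns. Let

\[ l = l_1 + l_2 + \cdots + l_p, \qquad m = m_1 + m_2 + \cdots + m_q, \qquad n = n_1 + n_2 + \cdots + n_t \]

be any ordered partitions of $l$, $m$ and $n$ respectively.
Draw lines through the arrays $C$ and $A$ separating them
into arrays of subarrays $C_{ij}$ and $A_{ij}$, respectively, such
that $C_{ij}$ has $l_i$ rows and $m_j$ columns, and $A_{ij}$ has $m_i$ rows
and $n_j$ columns.

Thus if

\[ l = 3 + 3, \qquad m = 2 + 1 + 1, \qquad n = 2 + 2, \]

\[ C = \left[ \begin{array}{cc|c|c} c_{11} & c_{12} & c_{13} & c_{14} \\ c_{21} & c_{22} & c_{23} & c_{24} \\ c_{31} & c_{32} & c_{33} & c_{34} \\ \hline c_{41} & c_{42} & c_{43} & c_{44} \\ c_{51} & c_{52} & c_{53} & c_{54} \\ c_{61} & c_{62} & c_{63} & c_{64} \end{array} \right], \qquad A = \left[ \begin{array}{cc|cc} a_{11} & a_{12} & a_{13} & a_{14} \\ a_{21} & a_{22} & a_{23} & a_{24} \\ \hline a_{31} & a_{32} & a_{33} & a_{34} \\ \hline a_{41} & a_{42} & a_{43} & a_{44} \end{array} \right]. \]

It is true that in general

\[ CA = \begin{bmatrix} C_{11} & C_{12} & C_{13} \\ C_{21} & C_{22} & C_{23} \end{bmatrix} \begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \\ A_{31} & A_{32} \end{bmatrix} = \Bigl( \sum_j C_{rj} A_{js} \Bigr) \qquad (r = 1, \dots, p; \ s = 1, \dots, t). \]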


That is, these subarrays behave in this situation like the 
elements of a matrix. 
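
The block rule can be tested directly. The following sketch assumes the partitions 3 + 3, 2 + 1 + 1 and 2 + 2 (those consistent with the worked example of a 6 x 4 array C and a 4 x 4 array A); the helper names `split`, `join` and `block_product` and the numerical entries are hypothetical.

```python
def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def matadd(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def split(M, row_parts, col_parts):
    """Cut M into a grid of subarrays with the given row and column partitions."""
    blocks, r0 = [], 0
    for rp in row_parts:
        row_blocks, c0 = [], 0
        for cp in col_parts:
            row_blocks.append([row[c0:c0 + cp] for row in M[r0:r0 + rp]])
            c0 += cp
        blocks.append(row_blocks)
        r0 += rp
    return blocks

def join(blocks):
    """Reassemble a grid of subarrays into one array."""
    out = []
    for row_blocks in blocks:
        for i in range(len(row_blocks[0])):
            out.append([x for blk in row_blocks for x in blk[i]])
    return out

def block_product(Cb, Ab):
    """Multiply the grids as if the subarrays were elements of a matrix."""
    result = []
    for i in range(len(Cb)):
        row = []
        for k in range(len(Ab[0])):
            acc = matmul(Cb[i][0], Ab[0][k])
            for j in range(1, len(Ab)):
                acc = matadd(acc, matmul(Cb[i][j], Ab[j][k]))
            row.append(acc)
        result.append(row)
    return result

C = [[6 * i + j for j in range(4)] for i in range(6)]               # 6 x 4
A = [[((i + 1) * (j + 2)) % 7 for j in range(4)] for i in range(4)]  # 4 x 4
Cb = split(C, [3, 3], [2, 1, 1])
Ab = split(A, [2, 1, 1], [2, 2])
print(join(block_product(Cb, Ab)) == matmul(C, A))
```

The column partition of C must match the row partition of A; otherwise the subarray products are not defined.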

The idea of the proof can be obtained easily from the
example without burdensome notation. The element in
Row 4 and Column 2 of $CA$ is

\[ c_{41}a_{12} + c_{42}a_{22} + c_{43}a_{32} + c_{44}a_{42}. \]

This is the element in the upper right-hand corner of

\[ C_{21}A_{11} + C_{22}A_{21} + C_{23}A_{31}. \]

In fact, the upper right-hand element of $C_{21}A_{11}$ contributes
the first two terms, the upper right-hand element of
$C_{22}A_{21}$ the next term, and the upper right-hand element
of $C_{23}A_{31}$ the last term.

11. Elementary matrices. We define three types of
elementary operation or transformation upon the rows of a
matrix as follows:

Type I. The interchange of two row vectors.

Type II. The multiplication of the elements of any
row vector by the same non-zero number of $F$.

Type III. The addition to any row vector of a multiple
of any other row vector.

Thus the matrices 



are obtained from $A = (a_{rs})$ by operations of Types I,
II and III, respectively.

Theorem 10. The inverse of every elementary operation
is an elementary operation of the same type.

Clearly an elementary operation of Type I is its own
inverse. The inverse of an elementary operation of Type
II is obtained by replacing the multiplier by its reciprocal.
The inverse of an elementary operation of Type III
is obtained by replacing the multiplier by its negative.

Theorem 11. Every elementary operation upon the
rows of a matrix leaves its row space invariant.

The row space $S$ of a matrix $A$ is the space (or linear
system) spanned by its row vectors $\phi_1, \phi_2, \dots, \phi_n$. That
the interchange of two of these vectors leaves $S$ unchanged
is trivial. Clearly every linear combination of
either of the sets

\[ \phi_1, \phi_2, \dots, c\phi_i, \dots, \phi_n; \qquad \phi_1, \phi_2, \dots, \phi_i + k\phi_j, \dots, \phi_n \]

is a linear combination of $\phi_1, \phi_2, \dots, \phi_n$, so $S$ is not
enlarged by a transformation of Types II or III. Since each
transformation has an inverse of the same type, which likewise
cannot enlarge the space, $S$ is left unchanged by any transformation.

Theorem 12. Each elementary row operation is associative
with matric multiplication.

That is, if $\Omega$ denotes the row operation,

\[ \Omega(AB) = (\Omega A)B. \]

Let $(a_{rs})(b_{rs}) = \bigl( \sum_i a_{ri} b_{is} \bigr)$. We consider the three
types of elementary transformation separately.

Type I: The first subscript of $a$ is the row index in
both $A$ and in $AB$, so an interchange in the order of the
rows of $A$ brings about this same interchange in the order
of the rows of $AB$, and produces no other change.

Type II: If the $j$th row of $A$ is multiplied by $c$ and
the other rows remain unchanged, then the $j$th row of
$AB$ becomes

\[ \sum_i c\,a_{ji} b_{is}. \]

That is, the $j$th row of $AB$ will be multiplied by $c$.

Type III: Let the elements of the $j$th row of $A$ be replaced
by

\[ a_{j1} + k a_{h1}, \quad a_{j2} + k a_{h2}, \quad \dots, \quad a_{jn} + k a_{hn}, \]

the other rows remaining unchanged. Then the $j$th row
of $AB$ will be

\[ \sum_i (a_{ji} + k a_{hi}) b_{i1}, \quad \dots, \quad \sum_i (a_{ji} + k a_{hi}) b_{in}, \]

which is equal to

\[ \sum_i a_{ji} b_{i1} + k \sum_i a_{hi} b_{i1}, \quad \dots, \quad \sum_i a_{ji} b_{in} + k \sum_i a_{hi} b_{in}. \]

This is the same result that would have been obtained
from the application of the elementary operations to
$AB$.

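A matrix which is obtained from the identity matrix
$I$ by an elementary operation is called an elementary
matrix. We have elementary matrices of each of the
three types, exemplified respectively by

\[ \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \end{bmatrix}, \qquad \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & c & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}, \qquad \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & k & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}. \]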


Theorem 13. Every elementary operation on the rows
of a matrix can be accomplished by multiplying the matrix
on the left by the corresponding elementary matrix.

This follows directly from Theorem 12. For if $\Omega$ is
any elementary row operation,

\[ \Omega A = \Omega(IA) = (\Omega I)A. \]
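
Theorems 10 and 13 lend themselves to a direct check. In this sketch the three operations are written as functions on lists of rows (rows indexed from 0), the elementary matrix is obtained by applying the operation to the identity, and the names `interchange`, `scale` and `add_multiple` are assumptions made for the illustration.

```python
from fractions import Fraction

def identity(n):
    return [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def interchange(M, i, j):
    """Type I: interchange rows i and j."""
    M = [row[:] for row in M]
    M[i], M[j] = M[j], M[i]
    return M

def scale(M, i, c):
    """Type II: multiply row i by the non-zero number c."""
    M = [row[:] for row in M]
    M[i] = [c * x for x in M[i]]
    return M

def add_multiple(M, i, j, k):
    """Type III: add k times row j to row i."""
    M = [row[:] for row in M]
    M[i] = [x + k * y for x, y in zip(M[i], M[j])]
    return M

A = [[1, 2, 0, 1], [3, 1, 4, 1], [5, 9, 2, 6], [5, 3, 5, 8]]
I = identity(4)
operations = [
    lambda M: interchange(M, 1, 3),
    lambda M: scale(M, 1, Fraction(3)),
    lambda M: add_multiple(M, 2, 1, Fraction(5)),
]
for op in operations:
    E = op(I)                     # elementary matrix of the operation
    print(matmul(E, A) == op(A))  # left multiplication performs the operation

# Inverse operations are of the same type (Theorem 10).
print(matmul(scale(I, 1, Fraction(1, 3)), scale(I, 1, Fraction(3))) == I)
print(matmul(add_multiple(I, 2, 1, Fraction(-5)), add_multiple(I, 2, 1, Fraction(5))) == I)
```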

The row rank of a matrix is the rank of the linear
system spanned by its row vectors. That is, it is the
maximum number of its row vectors which are linearly
independent.

Theorem 14. The row rank of a matrix is never raised
by multiplying the matrix on the left by another matrix.

Let

\[ AB = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1m} \\ a_{21} & a_{22} & \cdots & a_{2m} \\ \vdots & \vdots & & \vdots \\ a_{l1} & a_{l2} & \cdots & a_{lm} \end{bmatrix} \begin{bmatrix} \beta_1 \\ \beta_2 \\ \vdots \\ \beta_m \end{bmatrix} = \begin{bmatrix} \gamma_1 \\ \gamma_2 \\ \vdots \\ \gamma_l \end{bmatrix}. \]

Let $\psi_1, \psi_2, \dots, \psi_s$ be a basis of the linear system defined
by the $\gamma$’s, and let $\phi_1, \phi_2, \dots, \phi_r$ be a basis for the
linear system defined by the $\beta$’s. Since every $\psi_i$ is equal to
a linear combination of the $\gamma$’s, and every $\gamma_i$ is equal
to a linear combination of the $\beta$’s, and every $\beta_i$ is
equal to a linear combination of the $\phi$’s, it follows that
$\psi_1, \psi_2, \dots, \psi_s$ are linearly dependent upon $\phi_1, \phi_2, \dots,
\phi_r$. By Theorem 9, $s \le r$.

A matrix $A$ is called non-singular on the left if there exists
a matrix $A^{-1}$ such that

\[ A^{-1}A = I. \]

The matrix $A^{-1}$ is called a left inverse of $A$.

Theorem 15. If $A$ and $B$ are each non-singular on the
left, so is $AB$. A left inverse of $AB$ is $B^{-1}A^{-1}$.

For

\[ (B^{-1}A^{-1})AB = B^{-1}(A^{-1}A)B = B^{-1}IB = B^{-1}B = I. \]

Theorem 16. Every elementary matrix is non-singular
on the left, and has for an inverse the elementary matrix
corresponding to the inverse row operation.

From Theorem 10 we know that every elementary
transformation $\Omega$ has an inverse $\Omega^{-1}$ so that, for the identity
matrix $I$,

\[ \Omega^{-1}\Omega I = I. \]

Let $\Omega I = E$, $\Omega^{-1}I = E^{-1}$. Then by Theorem 13, $E^{-1}E = I$.

Theorem 17. The row rank of a matrix is never
changed by multiplying the matrix on the left by a matrix
which is non-singular on the left.

If $B$ is of rank $r$, then $AB$ is of rank $s \le r$ by Theorem
14, and $A^{-1}AB = B$ is of rank $r \le s$. Thus $r = s$.

Corollary 17. The row rank of a matrix is never
changed by an elementary transformation on its rows.

12. A normal form. Every linear system of $n$th order
vectors can be spanned by $n$ vectors (Theorem 8), and if
it is spanned by fewer than $n$ vectors, we can adjoin
0-vectors to bring the number up to $n$. Hence every
space can be spanned by the row vectors of a square
matrix. In this section and in the sections immediately
following we shall assume that every matrix is square
with $n$ rows and $n$ columns.

The principal or main diagonal of a square matrix $A
= (a_{rs})$ is the sequence of elements $a_{11}, a_{22}, \dots, a_{nn}$. The
matrix is triangular if every element above the principal
diagonal is 0.

Theorem 18. If $A$ is a square matrix, there exists a
matrix $P$ non-singular on the left which is the product of
elementary matrices such that $PA$ is triangular with every
diagonal element either 0 or 1; if the diagonal element in
any row is 0, the entire row is 0; if the diagonal element in
any column is 1, every other element of the column is 0.
This form is unique.

The first part of the proof is practically the same as
the proofs of Theorems 2 and 3. Let

\[ A = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{bmatrix}. \]

Unless every element of the last column is 0, we can
interchange rows so that $a_{nn} \ne 0$. Add to the $i$th row
$-a_{in}/a_{nn}$ times the last row so that the new $a_{in}$ will be
0. Thus in either case every element of the last column
above $a_{nn}$ can be made 0. Similarly every element above
the principal diagonal can be made 0, so that a triangular
matrix is obtained.

Let $a_{ii}$ be the last diagonal element which is 0. Let
$a_{ij}$ be the last element of the $i$th row which is not 0.
If $a_{jj} = 0$, interchange the $i$th and $j$th rows. If $a_{jj} \ne 0$,
add to the $i$th row $-a_{ij}/a_{jj}$ times the $j$th row. In either
case the new element in the $a_{ij}$ position has been made
0 without changing any element to the right of $a_{ij}$ in the
$i$th row or any element of a row below the $i$th. We continue
until every element to the left of every 0 in the
diagonal is 0.

By elementary transformations of Type II, every
non-zero diagonal element can be made 1 without destroying
the reductions which have been made. The following
matrix is in the form thus far obtained:

\[ \begin{bmatrix} 0 & 0 & 0 & 0 \\ a_{21} & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ a_{41} & a_{42} & a_{43} & 1 \end{bmatrix}. \]

Disregard the last row and let $a_{ii}$ be the last diagonal
element which is equal to 1. By adding $-a_{ni}$ times the
$i$th row to the last row, we can make the new $a_{ni}$ equal
to 0. Similarly every element under $a_{ii}$ which is not already
0 can be made equal to 0. Then if $a_{jj}$ is the diagonal
element equal to 1 which just precedes $a_{ii}$, every element
under $a_{jj}$ can be made 0. We proceed until the
form described in the theorem is attained.

Each elementary operation was accomplished by
multiplying the matrix on the left by an elementary
matrix, and the product of these elementary matrices
is the matrix $P$ of the theorem. Since each elementary
matrix is non-singular on the left, so is $P$, by Theorem
15.

This form of a matrix will be called the Hermite
canonical form, for it was first discovered by Hermite
for non-singular matrices whose elements were rational
integers.

Notice that the row vectors of the Hermite form constitute
a basis for the row space of the type described in
Theorem 8, and the rows which are not 0 are linearly
independent. Thus the row rank of a matrix is equal
to the number of non-zero rows in its Hermite canonical
form.
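
The canonical form of Theorem 18 can be computed by a short routine. The sketch below does not follow the order of steps in the proof; as a swapped-in technique it performs a complete elimination working from the last column to the first and then places each pivot row so that its leading 1 falls on the main diagonal, which yields a matrix satisfying the conditions of the theorem. Rational entries and the function name `hermite_form` are assumptions of the illustration.

```python
from fractions import Fraction

def hermite_form(A):
    """Reduce a square matrix by row operations to the form of Theorem 18."""
    n = len(A)
    rows = [[Fraction(x) for x in row] for row in A]
    pivot_cols = []                       # pivot column of rows[0], rows[1], ...
    for col in range(n - 1, -1, -1):      # last column first
        r = len(pivot_cols)
        pivot = next((i for i in range(r, n) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]            # Type I
        d = rows[r][col]
        rows[r] = [x / d for x in rows[r]]                     # Type II
        for i in range(n):                                     # Type III
            if i != r and rows[i][col] != 0:
                f = rows[i][col]
                rows[i] = [a - f * b for a, b in zip(rows[i], rows[r])]
        pivot_cols.append(col)
    H = [[Fraction(0)] * n for _ in range(n)]
    for r, col in enumerate(pivot_cols):
        H[col] = rows[r]                  # row interchanges put each 1 on the diagonal
    return H

A = [[1, 2, 3], [2, 4, 6], [1, 1, 1]]
H = hermite_form(A)
for row in H:
    print([str(x) for x in row])
print("rank:", sum(any(x != 0 for x in row) for row in H))
```

For this matrix the second row is twice the first, so one row of the form is zero and the count of non-zero rows gives the row rank.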

We shall now prove that the Hermite form is unique.
Consider the equation $PH = A$ where $P$ is non-singular.
We assume that $H$ and $A$ are in Hermite form, and
shall show that $A = H$. It is true, then, that every $h_{ii}$ is
either 0 or 1; if $h_{ii} = 0$, then every element of the $i$th
row is 0; and if $h_{ii} = 1$, then every other element of the
$i$th column is 0. The same normalization holds for the
elements of $A$.

The reader may be able to follow the proof more easily
by keeping in mind the example

\[ \begin{bmatrix} p_{11} & p_{12} & p_{13} & p_{14} \\ p_{21} & p_{22} & p_{23} & p_{24} \\ p_{31} & p_{32} & p_{33} & p_{34} \\ p_{41} & p_{42} & p_{43} & p_{44} \end{bmatrix} \begin{bmatrix} 0 & 0 & 0 & 0 \\ 2 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 3 & 0 & -1 & 1 \end{bmatrix} = \begin{bmatrix} a_{11} & 0 & 0 & 0 \\ a_{21} & a_{22} & 0 & 0 \\ a_{31} & a_{32} & a_{33} & 0 \\ a_{41} & a_{42} & a_{43} & a_{44} \end{bmatrix}. \]

If the $i$th column of $H$ has 1 in the main diagonal
(and consequently 0’s in every other position), then the
$i$th column of $P$ is equal to the $i$th column of $A$. The
undetermined columns of $P$ correspond to rows of 0’s
of $H$, so that the product of every $i$th row of $P$ by the $i$th
column of $H$ is $p_{ii}h_{ii}$, and this must equal $a_{ii}$. If $h_{ii} = 0$,
then $a_{ii} = 0$. Since $H$ and $A$ are of the same rank, $h_{ii} \ne 0$
implies $a_{ii} \ne 0$ and therefore $a_{ii} = 1$. Then $a_{ji} = 0$ for
$j > i$, since $A$ is in Hermite form, and every column of
$P$ which corresponds to a non-zero row of $H$ is a column
vector with 1 in the main diagonal and 0’s elsewhere.
Such a matrix $P$ leaves $H$ unaltered when used as a left
multiplier so that $A = H$.

13. Non-singular matrices. Let us now take the special
case of a square matrix $A$ of order and rank $n$.

Theorem 19. If the rows of a square matrix are linearly
independent, it is non-singular on the left, and conversely.

Let $PA = H$ where $H$ is the Hermite form. Since $P$ is
non-singular on the left, the row rank of $A$ is equal to
the row rank of $H$. If $H$ had any zeros in its principal
diagonal, the row rank of $H$ would be less than $n$, contradicting
the hypothesis that $A$ is of row rank $n$. Hence
$H = I$. Since $PA = I$, $A$ is non-singular on the left.

If $A$ is non-singular on the left, there exists a matrix
$P$ such that $PA = I$. The row rank of $I$ is $n$ so that, by
Theorem 14, the row rank of $A$ is at least $n$ and hence
equal to $n$. This proves the converse.

Theorem 20. Every square matrix which is non-singular
on the left is equal to a product of elementary matrices,
and has a left inverse which is also equal to a product of
elementary matrices.

For if $A$ is non-singular on the left, it is of rank $n$, its
Hermite form is $I$, and there exists a matrix $P$ which is a
product of elementary matrices such that $PA = I$. Let

\[ PA = E_k E_{k-1} \cdots E_2 E_1 A = I. \]

Then, upon multiplying on the left by the inverses, we
have

\[ A = E_1^{-1} E_2^{-1} \cdots E_{k-1}^{-1} E_k^{-1}. \]

But the inverse of an elementary matrix is an elementary
matrix by Theorems 10 and 16. Thus $A$ is a product of
elementary matrices.

If a matrix $A^*$ exists such that $AA^* = I$, then $A$ is
called non-singular on the right, and $A^*$ is called a right
inverse of $A$.

Theorem 21. If $A$ is non-singular on the left, it is
also non-singular on the right, and conversely.

If $A$ is non-singular on the left, there exists by
Theorem 20 a matrix

\[ P = E_k E_{k-1} \cdots E_2 E_1 \]

such that $PA = I$, the $E$’s being elementary matrices.
Then

\[ A = E_1^{-1} E_2^{-1} \cdots E_{k-1}^{-1} E_k^{-1}, \]

and

\[ AP = E_1^{-1} E_2^{-1} \cdots E_{k-1}^{-1} E_k^{-1} E_k E_{k-1} \cdots E_2 E_1 = I. \]

The converse is similarly proved.

By virtue of this theorem, square matrices are either
singular or non-singular, and we may drop the qualifying
phrases “on the left” and “on the right.”

Theorem 22. If $A$ is non-singular, it has a unique
left inverse and a unique right inverse, and these inverses
are equal to each other.

Suppose that $A^{-1}$ is a left inverse and $A^*$ a right inverse
of $A$. Then

\[ AA^* = I, \qquad A^{-1}A = I, \]

\[ A^{-1}(AA^*) = (A^{-1}A)A^*, \]

so that $A^{-1}I = IA^*$, whence $A^{-1} = A^*$. Thus every left
inverse is equal to every right inverse.

We shall now speak of the inverse $A^{-1}$ of a non-singular
matrix.
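
Since a non-singular matrix is carried to I by a succession of elementary row operations, applying the same operations to I produces the inverse. The following sketch assumes rational entries and the common device of reducing the array [A | I]; the name `inverse` and the sample matrix are illustrative.

```python
from fractions import Fraction

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def inverse(A):
    """Reduce [A | I] by elementary row operations; the right half ends as the inverse."""
    n = len(A)
    M = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
         for i, row in enumerate(A)]
    for col in range(n):
        pivot = next((i for i in range(col, n) if M[i][col] != 0), None)
        if pivot is None:
            raise ValueError("rows are linearly dependent: the matrix is singular")
        M[col], M[pivot] = M[pivot], M[col]                    # Type I
        d = M[col][col]
        M[col] = [x / d for x in M[col]]                       # Type II
        for i in range(n):                                     # Type III
            if i != col and M[i][col] != 0:
                f = M[i][col]
                M[i] = [a - f * b for a, b in zip(M[i], M[col])]
    return [row[n:] for row in M]

A = [[2, 1], [5, 3]]
P = inverse(A)
identity = [[1, 0], [0, 1]]
# The same matrix serves as left inverse and as right inverse (Theorems 21 and 22).
print(matmul(P, A) == identity, matmul(A, P) == identity)
```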

14. Column vectors. A matrix $A$ can equally well be
considered as defining the linear system of its column
vectors:

\[ A = [\psi_1, \psi_2, \dots, \psi_n], \qquad \psi_i = \begin{bmatrix} a_{1i} \\ a_{2i} \\ \vdots \\ a_{ni} \end{bmatrix}. \]

The reader will at once perceive the perfect duality of
the two concepts. Every theorem of this chapter can be
dualized by replacing “row” by “column” and making
whatever changes are appropriate. The most important
of these changes is that an elementary operation on the
columns of a matrix is accomplished by multiplying the
matrix on the right by an elementary matrix. (See
Theorem 13.) The elementary matrices themselves are
not different in form.

This duality can be at once established by replacing
$A$ by its transpose $A^T$. The column vectors of $A$ become
the row vectors of $A^T$. Then if

\[ PA^T = H \]

where $H$ is in the Hermite canonical form, it follows
from (12) that

\[ AP^T = H^T. \]

Thus $H^T$ is the Hermite canonical form of a matrix under
elementary transformations on its columns.

Two matrices $A$ and $B$ are called equivalent if two
non-singular matrices $P$ and $Q$ exist such that

\[ PAQ = B. \]

The term “equivalent” is justified by the following
properties:

(a). $A$ is equivalent to itself. For $IAI = A$.

(b). If $A$ is equivalent to $B$, then $B$ is equivalent to
$A$. For $P^{-1}BQ^{-1} = A$.

(c). If $A$ is equivalent to $B$ and $B$ is equivalent to $C$,
then $A$ is equivalent to $C$. For $PAQ = B$ and $RBS = C$ together
imply $(RP)A(QS) = C$ with $RP$ and $QS$ non-singular.

Theorem 23. Every matrix $A$ is equivalent to a diagonal
matrix in which every diagonal element is either 0
or 1.

By Theorem 18 we can find a matrix $P$ such that
$PA = H$ where $H$ is in the Hermite form. In this form
the only non-zero elements not in the principal diagonal
are to the left of diagonal elements which are equal to 1.
By elementary transformations on the columns, every
non-diagonal element can be made 0. The following
matrix in Hermite form will assist the reader in seeing
this.

\[ \begin{bmatrix} 1 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & a_{32} & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & a_{52} & 0 & a_{54} & 1 \end{bmatrix} \]

Corollary 23. The row rank of a matrix is equal to
its column rank.

We note first that the theorem is true for the Hermite
canonical form $H$. Those columns which have 1’s in
the principal diagonal are linearly independent, are $r$ in
number where $r$ is the row rank, and every other column
is a linear combination of these.

Now let $A$ be any matrix. We may make it square by
adjoining rows or columns of 0’s without altering either
its row rank or its column rank. Let $PAQ = D$ where $PA$
is the Hermite form of $A$ and $D$ is a diagonal matrix
having $r$ diagonal elements equal to 1. Clearly both the
row space and the column space of $D$ are of rank $r$. The
row rank of $A$ is equal to the row rank of $PA$ by Theorem
17, and we have just seen that this is equal to the
column rank of $PA$. The column rank of $PA$ is equal to
the column rank of $PAQ$ by the dual of Theorem 17.
Thus the row rank of $A$ is equal to $r$. Similarly the column
rank of $A$ is equal to the column rank $r_1$ of a diagonal
matrix $D_1$ having $r_1$ diagonal elements equal to 1.
Clearly $D$ and $D_1$ are equivalent, and their row ranks
are equal to their column ranks. If $P_1 D Q_1 = D_1$, the row
rank of $DQ_1$ is $r_1$. But $DQ_1$ has at least $n - r$ rows of 0’s
so that $r_1 \le r$. Since $D = P_1^{-1} D_1 Q_1^{-1}$, $r \le r_1$. Hence $r = r_1$.

We shall hereafter speak merely of the rank of a 
matrix. 
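
The equality of row rank and column rank is easy to observe on a rectangular example, since the column vectors of a matrix are the row vectors of its transpose. This is only a numerical illustration under the assumption of rational entries; `rank` and `transpose` are hypothetical helper names.

```python
from fractions import Fraction

def rank(M):
    """Row rank by elimination: the number of pivot rows found."""
    rows = [[Fraction(x) for x in row] for row in M]
    r = 0
    for col in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            f = rows[i][col] / rows[r][col]
            rows[i] = [a - f * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r

def transpose(M):
    return [list(col) for col in zip(*M)]

A = [[1, 2, 3, 4], [2, 4, 6, 8], [0, 1, 0, 1]]   # 3 x 4, second row twice the first
print(rank(A), rank(transpose(A)))               # row rank and column rank
```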

We may, in fact, simplify the canonical form of
Theorem 23 somewhat further by interchanging rows
and columns so as to bring the 1’s into the last $r$ positions
in the principal diagonal. Thus every matrix of
rank $r$ is equivalent to a matrix of the type

\[ \begin{bmatrix} 0 & 0 \\ 0 & I_r \end{bmatrix} \]

where $I_r$ denotes the $r \times r$ identity matrix. This will be
called the canonical form of a matrix under equivalence
transformations.

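15. Systems of equations. The system of equations

\[ \begin{aligned} a_{11}x_1 + \cdots + a_{1n}x_n &= c_1, \\ a_{21}x_1 + \cdots + a_{2n}x_n &= c_2, \\ &\ \ \vdots \\ a_{m1}x_1 + \cdots + a_{mn}x_n &= c_m \end{aligned} \tag{15} \]

can be written in matric notation as

\[ A\phi = \gamma \]

where

\[ A = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{bmatrix}, \qquad \phi = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}, \qquad \gamma = \begin{bmatrix} c_1 \\ c_2 \\ \vdots \\ c_m \end{bmatrix}. \]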


Theorem 24. If $A$ is an $m \times n$ matrix of rank $r$, there
exist matrices $P$ and $Q$ where $P$ is a product of elementary
matrices and $Q$ is a product of elementary matrices of Type
I such that

\[ PAQ = \begin{bmatrix} 1 & 0 & \cdots & 0 & k_{1,r+1} & \cdots & k_{1n} \\ 0 & 1 & \cdots & 0 & k_{2,r+1} & \cdots & k_{2n} \\ \vdots & \vdots & & \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & 1 & k_{r,r+1} & \cdots & k_{rn} \\ 0 & 0 & \cdots & 0 & 0 & \cdots & 0 \\ \vdots & \vdots & & \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & 0 & 0 & \cdots & 0 \end{bmatrix}. \tag{16} \]

Let $A$ be reduced to the Hermite form. There will be
$r$ 1’s in the principal diagonal, any row containing a 0
in the principal diagonal will be a zero vector, and any
column containing a 1 in the principal diagonal will be
a unit vector. Now any permutation of the rows or columns
will not change the number of these zero row vectors
or the number of unit column vectors. By permuting
the rows and columns, the 1’s can be placed in the
first $r$ diagonal positions. The matrix will then be in the
form (16). These permutations can be accomplished by
multiplying the Hermite form on the right and left by
elementary matrices of Type I.

By this means we again arrive at the solution of the
system of equations (15). The matrix $Q$ represents a
certain permutation of the columns of $A$. Apply this
permutation to the columns of $A$, and also to the unknowns
$x_1, x_2, \dots, x_n$, and relabel them in the normal
order. Call the new matrix $A'$. Then there exists a matrix
$P$ such that $PA'$ is of the form (16). To solve $A'\phi = \gamma$,
we merely multiply on the left by $P$:

\[ PA'\phi = P\gamma = \gamma'. \]

Now the vector $PA'\phi$ has its last $n - r$ components equal
to 0, and its $i$th component is

\[ x_i + \sum_{j=r+1}^{n} k_{ij}x_j \qquad (i = 1, 2, \dots, r). \]

Hence unless the last $n - r$ components of $\gamma'$ are 0, there
is no solution. If these are 0, we have immediately the
solution described in Theorem 7.
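
The procedure can be sketched in a few lines. The version below is an illustration under assumptions of its own: it reduces the augmented array by row operations without permuting columns (pivot columns are simply recorded), uses rational arithmetic, and returns the particular solution in which the unknowns not attached to a pivot are set to 0. The name `solve` is hypothetical.

```python
from fractions import Fraction

def solve(A, c):
    """Return one solution of A x = c as a list, or None if the system is inconsistent."""
    m, n = len(A), len(A[0])
    M = [[Fraction(x) for x in row] + [Fraction(ci)] for row, ci in zip(A, c)]
    pivots, r = [], 0
    for col in range(n):
        p = next((i for i in range(r, m) if M[i][col] != 0), None)
        if p is None:
            continue
        M[r], M[p] = M[p], M[r]
        d = M[r][col]
        M[r] = [x / d for x in M[r]]
        for i in range(m):
            if i != r and M[i][col] != 0:
                f = M[i][col]
                M[i] = [a - f * b for a, b in zip(M[i], M[r])]
        pivots.append(col)
        r += 1
    # Rows r, ..., m-1 are now zero on the left; a non-zero right side means no solution.
    if any(M[i][n] != 0 for i in range(r, m)):
        return None
    x = [Fraction(0)] * n
    for i, col in enumerate(pivots):
        x[col] = M[i][n]
    return x

A = [[1, 1, 1], [1, 2, 3], [2, 3, 4]]     # rank 2: the third row is the sum of the others
print(solve(A, [6, 14, 20]))              # consistent right-hand side
print(solve(A, [6, 14, 21]))              # inconsistent right-hand side
```

In the consistent case the remaining solutions arise by giving arbitrary values to the unknowns whose columns carry no pivot.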

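16. On the rank of a product. The following theorem,
which is readily provable at this point, will be found useful
later on.

Theorem 25. If $A$ and $B$ are $n \times n$ matrices of rank $r$
whose row vectors span the same space, and if $s$ is any integer
with $r \le s \le n$, there exists a matrix $P$ of rank $s$ such that
$A = PB$.

If $A$ is of rank $r$, we can pass to the Hermite form
and then permute rows so that

\[ P_1 A = \begin{bmatrix} A_1 \\ 0 \end{bmatrix} \]

where $A_1$ is an $r \times n$ matrix whose row vectors $\alpha_1,
\alpha_2, \dots, \alpha_r$ are linearly independent. Similarly we can
write

\[ Q_1 B = \begin{bmatrix} B_1 \\ 0 \end{bmatrix} \]

where $\beta_1, \beta_2, \dots, \beta_r$ are the row vectors of $B_1$. Both $P_1$
and $Q_1$ are non-singular. If the rows of $A$ and $B$ span the
same space, every row vector of $A_1$ is a linear combination
of the row vectors of $B_1$, and vice versa. Thus there
exist numbers $c_{ij}$, $d_{ij}$ of $F$ such that

\[ \alpha_i = \sum_{j=1}^{r} c_{ij}\beta_j, \qquad \beta_j = \sum_{k=1}^{r} d_{jk}\alpha_k, \qquad \alpha_i = \sum_{j,k=1}^{r} c_{ij} d_{jk} \alpha_k. \]

But $\alpha_1, \alpha_2, \dots, \alpha_r$ are linearly independent, so

\[ \sum_{j=1}^{r} c_{ij} d_{jk} = \delta_{ik}. \]

That is, $CD = I_r$, $A_1 = CB_1$, $B_1 = DA_1$ where

\[ C = \begin{bmatrix} c_{11} & c_{12} & \cdots & c_{1r} \\ c_{21} & c_{22} & \cdots & c_{2r} \\ \vdots & \vdots & & \vdots \\ c_{r1} & c_{r2} & \cdots & c_{rr} \end{bmatrix}, \qquad D = \begin{bmatrix} d_{11} & d_{12} & \cdots & d_{1r} \\ d_{21} & d_{22} & \cdots & d_{2r} \\ \vdots & \vdots & & \vdots \\ d_{r1} & d_{r2} & \cdots & d_{rr} \end{bmatrix}. \]

Since $C$ and $D$ are inverses of each other, each is non-singular
and hence of rank $r$. Now let

\[ M = \begin{bmatrix} C & 0 \\ 0 & E \end{bmatrix} \]

where $E$ is any $(n-r) \times (n-r)$ array. By §10,

\[ M Q_1 B = \begin{bmatrix} C & 0 \\ 0 & E \end{bmatrix} \begin{bmatrix} B_1 \\ 0 \end{bmatrix} = \begin{bmatrix} CB_1 \\ 0 \end{bmatrix} = \begin{bmatrix} A_1 \\ 0 \end{bmatrix} = P_1 A. \]

Set $P = P_1^{-1} M Q_1$. Then $A = PB$. But $E$ is arbitrary, so
by a proper choice of $E$, $M$ may be made to have any
rank $s$ from $r$ to $n$ inclusive. Since $P_1$ and $Q_1$ are non-singular,
$P$ has the same rank as $M$.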



CHAPTER III 
DETERMINANTS 


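17. Complex numbers. Consider the set of all matrices
of the form

\[ A = \begin{bmatrix} x & y \\ -y & x \end{bmatrix}, \qquad B = \begin{bmatrix} u & v \\ -v & u \end{bmatrix}, \qquad \dots \]

where $x, y, u, v, \dots$ are real numbers. The sum and
product,

\[ A + B = \begin{bmatrix} x + u & y + v \\ -(y + v) & x + u \end{bmatrix}, \]

\[ AB = BA = \begin{bmatrix} xu - yv & xv + yu \\ -(xv + yu) & xu - yv \end{bmatrix}, \]

are both of the same form as $A$ and $B$; that is, they
have their diagonal elements equal to each other and the
other two elements negatives of each other. In particular
the zero matrix and the identity matrix,

\[ O = \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix}, \qquad I = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}, \]

are of this form. Thus the set of all matrices of this form
is closed under addition and multiplication, and constitutes
a matric algebra with unit element.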

This algebra will be a field* if every matrix $A$ except
$O$ has an inverse. The product $AB$ will equal $I$ if $u$ and $v$
are so chosen that

\[ xu - yv = 1, \qquad yu + xv = 0. \]

The only possible solution is

\[ u = \frac{x}{x^2 + y^2}, \qquad v = \frac{-y}{x^2 + y^2}, \]

which exists for every $A$ except $A = O$. Thus the above
set of matrices is actually a field.

* See §63 for the definition of a field.

The correspondence

\[ \begin{bmatrix} x & 0 \\ 0 & x \end{bmatrix} \leftrightarrow x \]

defines an isomorphism (§9) of a subset of our matric algebra
with the real field $R$. The matrix $A$ can be written

\[ A = xI + yJ \]

where

\[ J = \begin{bmatrix} 0 & 1 \\ -1 & 0 \end{bmatrix}. \]

Clearly $I$ corresponds to 1 under the above isomorphism.
Moreover

\[ J^2 = \begin{bmatrix} -1 & 0 \\ 0 & -1 \end{bmatrix} = -I. \]

Hence the correspondence

\[ A = \begin{bmatrix} x & y \\ -y & x \end{bmatrix} \leftrightarrow x + yi \]

defines an isomorphism, or representation of the complex
field as a matric algebra over the real field.

The polynomial $x^2 + y^2$ is the least common denominator
of the elements of $A^{-1}$ when $x$ and $y$ are indeterminates,
and we denote this polynomial by $d(A)$ and
call it the determinant of $A$. Every particular matrix $A$
whose elements are real numbers has a determinant
which is obtainable from $x^2 + y^2$ by giving to $x$ and $y$
their appropriate values.

If $u$ and $v$ also are indeterminates,

\[ \begin{aligned} d(AB) &= (xu - yv)^2 + (xv + yu)^2 \\ &= x^2u^2 + y^2v^2 + x^2v^2 + y^2u^2 \\ &= (x^2 + y^2)(u^2 + v^2) = d(A)d(B). \end{aligned} \]

Thus the determinant of the product of the matrices
which are isomorphic with two complex numbers is
equal to the product of their determinants. If $A$ corresponds
to the complex number $\alpha$, it is at once evident
that $d(A)$ is the norm of $\alpha$, that is, the square of its absolute
value.
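
The representation can be tried with Python's built-in complex numbers. This is a small sketch: the helper names `as_matrix` and `d` are assumptions, and the sample values are chosen with integer parts so that the floating-point comparisons are exact.

```python
def as_matrix(z):
    """Matrix [[x, y], [-y, x]] corresponding to the complex number x + yi."""
    x, y = z.real, z.imag
    return [[x, y], [-y, x]]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def d(A):
    """The polynomial x^2 + y^2 evaluated at the entries of A."""
    return A[0][0] ** 2 + A[0][1] ** 2

z, w = complex(1, 2), complex(3, -1)
A, B = as_matrix(z), as_matrix(w)

print(matmul(A, B) == as_matrix(z * w))   # products correspond
print(matmul(A, B) == matmul(B, A))       # these matrices commute
print(d(matmul(A, B)) == d(A) * d(B))     # d is multiplicative
print(d(A) == abs(z) ** 2 or d(A) == z.real ** 2 + z.imag ** 2)  # d(A) is the norm
```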

18 . Matrices as hypercomplex numbers. Let F be 
any field, and let M consist of the set of all nXn mat- 
rices with elements in F. If 

A = (#f«), B — (bra), C ** (^r«)» * * * 
are these matrices, we have defined (§9) 

A + B - (Or, + b„), AB = (£ a n bu). 

It is clear that $M$ is a commutative group* with respect to addition,
the identity being the matrix 0 all of whose elements are 0. With
respect to multiplication, however, we have only closure, the
associative law, the distributive law, and the existence of the
identity $I = (\delta_{rs})$. The two other properties which would be
required for $M$ to be a field are not necessarily present when
$n > 1$, namely the commutative law of multiplication and the existence
of the reciprocal of every matrix except 0. Thus $M$ is an instance of
a ring with unit element.

* See §62 for the definition of group.

We shall investigate the possibility of associating 
with every matrix of M a valuation or number d(A) such 
that 

d(AB) = d(A)d(B). 

Such a correspondence $A \to d(A)$ can always be accomplished by
defining $d(A)$ to be 0 for every matrix $A$, or 1 for every matrix
$A$, or 0 if $A$ is singular and 1 if $A$ is non-singular. But these
valuations are quite trivial. If there is one non-trivial valuation
where $d(A)$ is neither 0 nor 1 for some matrix $A$, there are
infinitely many, for

$$A \to [d(A)]^k$$

is another non-trivial valuation for $k > 1$.

Let $X$ and $Y$ be two matrices whose elements are independent
indeterminates $x_{rs}$, $y_{rs}$. We shall endeavor to find a
polynomial $d(X)$ in the elements $x_{rs}$ which is of lowest possible
degree $> 0$ such that

(17) $$d(XY) = d(X)\,d(Y).$$

If such a polynomial can be found, then for every matrix $A$ of $M$ we
may define $d(A)$ to be that number of $F$ which is obtained by
replacing the elements $x_{rs}$ of $X$ in $d(X)$ by the respective
elements $a_{rs}$ of $A$.

Since

$$d(X) = d(XI) = d(X)\,d(I)$$

and $d(X)$ is of degree $> 0$, it follows that $d(I) = 1$. Similarly

$$d(0) = d(X \cdot 0) = d(X)\,d(0),$$

$$d(0)\,[1 - d(X)] = 0.$$

But $1 - d(X)$ is of degree $> 0$ so that $d(0) = 0$.





We shall now attempt to determine the valuations of the elementary
matrices. Nothing will be lost except complexity of notation if we
restrict attention to the case $n = 3$.

Let

$$U = \begin{bmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix}$$

be any elementary matrix of Type I. Clearly $U^2 = I$, so that

$$[d(U)]^2 = 1, \qquad d(U) = \pm 1.$$

Now let

$$V(k) = \begin{bmatrix} 1 & k & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}$$

be any elementary matrix of Type III, $k$ being an indeterminate. Since
$V(k) \cdot V(-k) = I$,

$$dV(k)\,dV(-k) = 1.$$

Now $dV(-k)$ is a polynomial in $k$ of the same degree as that of
$dV(k)$, and since the sum of their degrees is 0, each is a constant.
Then $dV(k)$ has for every $k$ the same value that it has for $k = 0$,
namely 1.

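Let

$$W_1(l) = \begin{bmatrix} l & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}, \qquad
W_2(l) = \begin{bmatrix} 1 & 0 & 0 \\ 0 & l & 0 \\ 0 & 0 & 1 \end{bmatrix}$$

be two elementary matrices of Type II which differ only in the position
of the indeterminate $l$ in the main diagonal. Clearly

$$\begin{bmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} l & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix}
= \begin{bmatrix} 1 & 0 & 0 \\ 0 & l & 0 \\ 0 & 0 & 1 \end{bmatrix}$$

so that we have

$$d(U) \cdot dW_1(l) \cdot d(U) = dW_2(l).$$

Since $d(U) = \pm 1$, $dW_1(l) = dW_2(l)$. Hence we may denote
$dW_i(l)$ by $dW(l)$.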

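It is easily seen that

$$\begin{bmatrix} l & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} 1 & 0 & 0 \\ 0 & l & 0 \\ 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & l \end{bmatrix}
= \begin{bmatrix} l & 0 & 0 \\ 0 & l & 0 \\ 0 & 0 & l \end{bmatrix}.$$

Hence the valuation of the scalar matrix $S_l$ is the $n$th power of
the valuation of $W(l)$. When $l = 0$ we have

$$[dW(0)]^n = d(0) = 0,$$

so that $dW(0) = 0$.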

Now $dW(l)$ cannot vanish for $l \neq 0$, for

$$W(l) \cdot W(1/l) = I,$$
$$dW(l) \cdot dW(1/l) = 1.$$

Upon equating coefficients we can show that $dW(l)$ has the form
$c\,l^i$, where $c$ is in $F$. But $dW(l) = 1$ when $l = 1$ so that
$c = 1$.

We have shown that

$$d(U) = \pm 1, \qquad dV(k) = 1, \qquad dW(l) = l^i, \quad i > 0,$$

$$d[W(l) \cdot X] = dW(l) \cdot d(X) = l^i\,d(X).$$

Now $W(l) \cdot X$ is the matrix $X$ with one of its rows multiplied by
$l$. Thus $d(X)$ is homogeneous of degree $i$ in the elements of each
row of $X$.

It is impossible to prove that $i = 1$ from (17) alone, for any power
of $d(X)$ will satisfy (17) if $d(X)$ does. But we are seeking the
valuation function of lowest degree, and it will yield the smallest
consistent value of $i$ in $dW(l)$ above. Our investigation from this
point on will have to be tentative. We shall assume that $dW(l) = l$
and attempt under this assumption to find a $d(X)$ satisfying (17). If
this proves to be impossible, we shall have to return to this point and
assume a larger value for $i$.

We may now settle the ambiguity in the case of 
d(U). From the equality 



it follows that 

$$dV_1(-1)\,dV_2(1)\,dV_1(-1)\,d(U) = dW(-1).$$

Since $dV(k) = 1$ and $dW(l) = l$,

$$d(U) = -1.$$
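The matrix identities used in this section are easy to confirm by direct multiplication. The sketch below is illustrative only: the constructors `type_i`, `type_ii`, `type_iii` are hypothetical names, indices are 0-based, and the particular choice $V_1(k) = I + kE_{12}$, $V_2(k) = I + kE_{21}$ is an assumption about which Type III matrices enter the last equality; it is one choice for which the product does reduce to $W_1(-1)$.

```python
def identity(n):
    return [[int(r == s) for s in range(n)] for r in range(n)]

def matmul(a, b):
    n = len(a)
    return [[sum(a[r][i] * b[i][s] for i in range(n)) for s in range(n)]
            for r in range(n)]

def type_i(n, i, j):
    """Identity with rows i and j interchanged."""
    m = identity(n)
    m[i], m[j] = m[j], m[i]
    return m

def type_ii(n, i, l):
    """Identity with the (i, i) element replaced by l."""
    m = identity(n)
    m[i][i] = l
    return m

def type_iii(n, i, j, k):
    """Identity with the 0 in row i, column j replaced by k."""
    m = identity(n)
    m[i][j] = k
    return m

I3 = identity(3)
U = type_i(3, 0, 1)
k, l = 5, 7

assert matmul(U, U) == I3                                        # U^2 = I
assert matmul(type_iii(3, 0, 1, k), type_iii(3, 0, 1, -k)) == I3  # V(k)V(-k) = I
assert matmul(matmul(U, type_ii(3, 0, l)), U) == type_ii(3, 1, l)  # U W_1 U = W_2

V1 = type_iii(3, 0, 1, -1)   # assumed V_1(-1)
V2 = type_iii(3, 1, 0, 1)    # assumed V_2(1)
product = matmul(matmul(matmul(V1, V2), V1), U)
assert product == type_ii(3, 0, -1)                              # equals W_1(-1)
```

Taking valuations of both sides of the last identity, with $dV(k) = 1$ and $dW(l) = l$, gives $d(U) = -1$.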

We have now proved, subject to the assumption $dW(l) = l$ which will
eventually be removed:

Theorem 26. Let $U$, $V$, $W(l)$ be elementary matrices of Types I, III
and II, respectively. Then

$$d(U) = -1, \qquad d(V) = 1, \qquad dW(l) = l.$$

The last equation also holds when $l = 0$.

19. Determinants. We have the result that

$$d[W(l) \cdot X] = dW(l)\,d(X) = l\,d(X).$$

Now $W(l) \cdot X$ is the matrix $X$ with one of its rows multiplied by
$l$. Thus $d(X)$ is homogeneous and linear in the elements of each row
of $X$. Moreover,

$$d(UX) = d(U)\,d(X) = -d(X),$$

so that $d(X)$ merely changes sign when two rows are permuted.

Weierstrass defined a determinant as a polynomial in the elements
$x_{rs}$ of an $n \times n$ array $X$ which is linear and homogeneous
in the elements of each row, which merely changes sign when two rows
are permuted, and which reduces to 1 when $X = I$. We have just proved
that our function $d(X)$ has these properties so that we can identify
the determinant of $X$ with the valuation $d(X)$, and use the method of
Weierstrass to obtain its explicit form.

Let us first consider the case where $n = 3$. Since $d(X)$ is linear
and homogeneous in the elements of each row, it has the form

$$d(X) = \sum_{i,j,k=1}^{3} \epsilon_{ijk}\, x_{1i} x_{2j} x_{3k},$$

the $\epsilon_{ijk}$ being numbers of $F$. Let $X' = UX$ be obtained
from $X$ by the interchange of two rows, say the first two. Then

$$d(X) + d(X') = 0 = \sum (\epsilon_{ijk} + \epsilon_{jik})\, x_{1i} x_{2j} x_{3k}.$$

Since the terms of this sum are linearly independent, it must be true
that

$$\epsilon_{ijk} = -\epsilon_{jik}.$$

In general, if two coefficients are such that one is obtainable from
the other by a single transposition of subscripts, they are negatives
of each other. It then follows that $\epsilon_{ijk}$ is 0 unless the
subscripts are distinct, provided $F$ is not of characteristic 2; for
if $i = j$, $\epsilon_{iik} = -\epsilon_{iik}$ implies
$2\epsilon_{iik} = 0$.

When $X$ reduces to $I$, $d(X)$ reduces to $\epsilon_{123}$, which is
therefore equal to 1. Hence $\epsilon_{ijk}$ is equal to 1 or $-1$
according as

$$\begin{pmatrix} 1 & 2 & 3 \\ i & j & k \end{pmatrix}$$

is an even or an odd permutation, and $d(X)$ is completely defined. It
remains to be shown, however, that it meets our Requirement (17).

The same situation holds for any value of $n$. If $X$ is an
$n \times n$ matrix with indeterminate elements, we define

$$d(X) = \sum (-1)^h\, x_{1 i_1} x_{2 i_2} \cdots x_{n i_n}$$

where the summation extends over all permutations

$$\begin{pmatrix} 1 & 2 & \cdots & n \\ i_1 & i_2 & \cdots & i_n \end{pmatrix}$$

and $h$ is a number of transpositions into which this permutation can
be factored. Although $h$ is not unique, it is shown in finite group
theory that the oddness or evenness of $h$ is unique.* Thus $d(X)$ is
well defined.

If $A = (a_{rs})$ is an $n \times n$ matrix with elements in $F$, we
define $d(A)$ to be the functional value of $d(X)$ under the
substitution $x_{rs} \to a_{rs}$.
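The definition translates almost word for word into a short program. The sketch below is an illustration under stated assumptions, not an efficient method: `sign` obtains the parity of $h$ by sorting the permutation with explicit transpositions, `det` forms the sum over all permutations, and the sample matrices are arbitrary integer choices.

```python
from itertools import permutations

def sign(perm):
    """(-1)^h, where h counts the transpositions used to sort perm."""
    perm = list(perm)
    h = 0
    for i in range(len(perm)):
        while perm[i] != i:
            j = perm[i]
            perm[i], perm[j] = perm[j], perm[i]  # one transposition
            h += 1
    return -1 if h % 2 else 1

def det(a):
    """Sum of (-1)^h a[0][i_1] a[1][i_2] ... over all permutations."""
    n = len(a)
    total = 0
    for perm in permutations(range(n)):
        term = sign(perm)
        for r in range(n):
            term *= a[r][perm[r]]
        total += term
    return total

def matmul(a, b):
    n = len(a)
    return [[sum(a[r][i] * b[i][s] for i in range(n)) for s in range(n)]
            for r in range(n)]

A = [[2, 0, 1], [1, 3, 2], [1, 1, 4]]
B = [[1, 2, 0], [0, 1, 1], [3, 0, 1]]

assert det([[1, 0, 0], [0, 1, 0], [0, 0, 1]]) == 1   # reduces to 1 when X = I
assert det([A[1], A[0], A[2]]) == -det(A)            # sign change on row swap
assert det(matmul(A, B)) == det(A) * det(B)          # Requirement (17)
```

The three assertions correspond to the Weierstrass properties and to Requirement (17) for this one pair of matrices; they illustrate the statements but of course do not prove them.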

We shall now show that $d(X)$ meets our Requirement (17).

* See the author's Introduction to Abstract Algebra, Wiley and Sons,
1940, p. 72.

Theorem 27. If $A$ and $B$ are two matrices,

$$d(AB) = d(A)\,d(B).$$

Consider a polynomial $f(X)$ in the elements $x_{rs}$ of an
$n \times n$ array $X$ which is linear and homogeneous in the elements
of each row, and which merely changes sign when two rows are permuted.
We proceed just as in the derivation of the explicit form of $d(X)$
from the Weierstrass definition, except that we conclude that
$\epsilon_{i_1 i_2 \cdots i_n} = \pm f(I)$. Thus
$f(X) = d(X) \cdot f(I)$.

Let $X = (x_{rs})$ and $Y = (y_{rs})$ be two matrices whose elements
are independent indeterminates, and form their product

$$XY = \Bigl(\sum_i x_{ri}\, y_{is}\Bigr).$$

Now $d(XY)$ is linear and homogeneous in the indeterminates $x_{rs}$ of
each row of $X$. A permutation of the rows of $X$ permutes the
corresponding rows of $XY$; thus $d(XY)$ merely changes sign when two
rows of $X$ are permuted. Thus

$$d(XY) = d(X) \cdot K$$

where $K$ is the value of $d(XY)$ for $X \to I$, that is,
$K = d(Y)$. Then

$$d(XY) = d(X)\,d(Y).$$

If $A$ and $B$ are two matrices with elements in $F$, we have

$$d(AB) = d(A)\,d(B).$$

20. The adjoint. As we have seen, $d(X)$ merely changes in sign when
two rows of $X$ are permuted. Therefore $d(A) = 0$ if $A$ has two equal
rows; for the permutation of these equal rows changes $d(A)$ into its
negative, and yet obviously does not change $d(A)$ at all. Hence*

$$d(A) = -d(A), \qquad 2\,d(A) = 0$$

and, if $2 \neq 0$, $d(A) = 0$.

We may write

$$d(X) = \sum_{j_i = 1}^{n} x_{i j_i} \sum (-1)^h\, x_{1 j_1} \cdots x_{i-1,\, j_{i-1}}\, x_{i+1,\, j_{i+1}} \cdots x_{n j_n}$$

where the second summation extends over all permutations of the
integers $1, 2, \cdots, n$ with $j_i$ left out. This second summation
depends upon both $i$ and $j_i$, and we may denote it by $X_{i j_i}$
and write

(18) $$d(X) = \sum_{j=1}^{n} x_{ij} X_{ij}.$$

We shall call $X_{ij}$ the cofactor of $x_{ij}$ in $X$.

It is important to note that $X_{ij}$ is independent of the
indeterminates $x_{i1}, x_{i2}, \cdots, x_{in}$ composing the $i$th row
of $X$, and also of the indeterminates $x_{1j}, x_{2j}, \cdots, x_{nj}$
composing the $j$th column of $X$. Thus the $n^2$ cofactors $X_{ij}$
have no common factor of degree $> 0$.

Let us replace in $X$ the elements $x_{i1}, x_{i2}, \cdots, x_{in}$ of
the $i$th row by the elements $x_{k1}, x_{k2}, \cdots, x_{kn}$ of the
$k$th row, where $k \neq i$, and call the new matrix $X'$. Then $X'$
will have two equal rows so that $d(X') = 0$. This substitution will
have no effect upon the cofactors $X_{i1}, X_{i2}, \cdots, X_{in}$ of
the elements of the $i$th row of $X$. Hence

$$d(X') = \sum_{j=1}^{n} x_{kj} X_{ij} = 0, \qquad k \neq i.$$

* If $F$ is of characteristic 2 the argument has to be modified but
the result is still true.

We can combine this equation with (18) to obtain

(19) $$\sum_{j=1}^{n} x_{kj} X_{ij} = \delta_{ki}\, d(X)$$

where $\delta_{ki}$ is Kronecker's delta.

The matrix

$$\operatorname{adj} X = (X_{sr}),$$

where $r$ denotes the row and $s$ the column in which $X_{sr}$ stands,
is called the adjoint of $X$. The equation (19) above may now be
written in matric notation as

$$X \cdot \operatorname{adj} X = d(X) \cdot I.$$

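Since $d(X)$ is not identically 0, we have

$$X \cdot \frac{\operatorname{adj} X}{d(X)} = I.$$

That is, $\operatorname{adj} X / d(X)$ is the inverse $X^{-1}$ of $X$.
Since $X^{-1} X = I$,

$$\operatorname{adj} X \cdot X = d(X) \cdot I,$$

or

(20) $$\sum_{j=1}^{n} x_{jk} X_{ji} = \delta_{ki}\, d(X).$$

Relations (19) and (20) are of basic importance in determinant theory.
We shall use them to prove two well-known theorems.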
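Relations (19) and (20) can be checked on a numerical matrix. The following is a minimal sketch, assuming the familiar description of the cofactor of the $(i, j)$ element as $(-1)^{i+j}$ times the determinant of the array left after deleting row $i$ and column $j$; the function names are illustrative and indices are 0-based.

```python
def minor(a, i, j):
    """Array left after deleting row i and column j."""
    return [row[:j] + row[j + 1:] for r, row in enumerate(a) if r != i]

def det(a):
    """Expansion by the elements of the first row, as in (18)."""
    if not a:
        return 1  # determinant of the empty array
    return sum((-1) ** j * a[0][j] * det(minor(a, 0, j))
               for j in range(len(a)))

def cofactor(a, i, j):
    return (-1) ** (i + j) * det(minor(a, i, j))

def adj(a):
    """Adjoint: the element in row r, column s is the cofactor X_sr."""
    n = len(a)
    return [[cofactor(a, s, r) for s in range(n)] for r in range(n)]

def matmul(a, b):
    n = len(a)
    return [[sum(a[r][i] * b[i][s] for i in range(n)) for s in range(n)]
            for r in range(n)]

A = [[2, 0, 1], [1, 3, 2], [1, 1, 4]]
dA = det(A)
scalar = [[dA * int(r == s) for s in range(3)] for r in range(3)]

assert matmul(A, adj(A)) == scalar   # relation (19): A . adj A = d(A) I
assert matmul(adj(A), A) == scalar   # relation (20): adj A . A = d(A) I
```

Note the transposition in `adj`: the cofactor of the element in row s and column r is placed in row r and column s.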

Theorem 28. Cramer's Rule. Let

$$a_{i1}x_1 + a_{i2}x_2 + \cdots + a_{in}x_n = c_i \qquad (i = 1, 2, \cdots, n)$$

be a system of non-homogeneous equations whose matrix $A = (a_{rs})$ is
non-singular. Let $H_k$ be the matrix derived from $A$ by replacing its
$k$th column by $c_1, c_2, \cdots, c_n$. The only solution of the
system of equations is

$$\bigl(d(H_1)/d(A),\ d(H_2)/d(A),\ \cdots,\ d(H_n)/d(A)\bigr).$$

If $(x_1', x_2', \cdots, x_n')$ is a solution of the given system, then

$$\sum_{j=1}^{n} a_{ij} x_j' = c_i.$$

Let $A_{ik}$ be the cofactor of $a_{ik}$ in the matrix $A$. Then

$$\sum_{i=1}^{n} \sum_{j=1}^{n} A_{ik}\, a_{ij}\, x_j' = \sum_{i=1}^{n} A_{ik}\, c_i,$$

$$\sum_{j=1}^{n} \Bigl[\sum_{i=1}^{n} A_{ik}\, a_{ij}\Bigr] x_j' = \sum_{i=1}^{n} A_{ik}\, c_i.$$

If in the sum $\sum A_{ik} c_i$ we replace $c_i$ by $a_{ik}$, we obtain
$d(A)$. Thus

$$\sum_{i=1}^{n} A_{ik}\, c_i = d(H_k)$$

is the determinant of the matrix $H_k$ obtained from $A$ by replacing
the $k$th column by the $c$'s. Making use of (20), we have

$$\sum_{j=1}^{n} \delta_{kj}\, d(A)\, x_j' = d(A) \cdot x_k' = d(H_k).$$

If $d(A) \neq 0$, our assumed solution can be nothing other than

$$\bigl(d(H_1)/d(A),\ d(H_2)/d(A),\ \cdots,\ d(H_n)/d(A)\bigr).$$

That this is actually a solution can be established by substituting it
into the given equations.
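As a concrete illustration of the rule, the sketch below solves a small system with exact rational arithmetic. The function name `cramer` and the sample system are assumptions made for the example; the determinant is computed by expansion along the first row.

```python
from fractions import Fraction

def det(a):
    """Determinant by expansion along the first row."""
    if not a:
        return 1
    return sum((-1) ** j * a[0][j]
               * det([row[:j] + row[j + 1:] for row in a[1:]])
               for j in range(len(a)))

def cramer(a, c):
    """Solve a x = c by x_k = d(H_k) / d(A); a must be non-singular."""
    d_a = det(a)
    if d_a == 0:
        raise ValueError("matrix is singular; the rule does not apply")
    solution = []
    for k in range(len(a)):
        # H_k: the matrix A with its kth column replaced by the c's.
        h_k = [row[:k] + [c[i]] + row[k + 1:] for i, row in enumerate(a)]
        solution.append(Fraction(det(h_k), d_a))
    return solution

A = [[2, 0, 1], [1, 3, 2], [1, 1, 4]]
c = [3, 6, 6]
x = cramer(A, c)

# Substitute the result back into the given equations.
for row, c_i in zip(A, c):
    assert sum(a_ij * x_j for a_ij, x_j in zip(row, x)) == c_i
```

The final loop carries out the substitution mentioned at the end of the proof.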

It should be noted that this method of solving systems of equations is
not fundamentally different from the method described in §1 and §2. The
choice of the cofactors $A_{ik}$ for the $k$'s of Theorem 1 produces an
equation all of whose coefficients vanish except the coefficient of
$x_k$.

Another useful and familiar theorem is

Theorem 29. Let

$$a_{i1}x_1 + a_{i2}x_2 + \cdots + a_{in}x_n = 0 \qquad (i = 1, 2, \cdots, n)$$

be a system of homogeneous equations whose matrix $A = (a_{rs})$ is
singular. For every value of $k$,

$$(A_{k1}, A_{k2}, \cdots, A_{kn})$$

is a solution.

From (19) we have

$$\sum_{j=1}^{n} a_{ij} A_{kj} = \delta_{ik}\, d(A) = 0,$$

which proves the theorem.
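A small numerical instance may make the theorem concrete. The sketch below takes a singular $3 \times 3$ matrix chosen for the example, forms the cofactors of its first row, and substitutes them into every equation; the helper names are illustrative and indices are 0-based.

```python
def det(a):
    """Determinant by expansion along the first row."""
    if not a:
        return 1
    return sum((-1) ** j * a[0][j]
               * det([row[:j] + row[j + 1:] for row in a[1:]])
               for j in range(len(a)))

def cofactor(a, i, j):
    rest = [row[:j] + row[j + 1:] for r, row in enumerate(a) if r != i]
    return (-1) ** (i + j) * det(rest)

A = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
assert det(A) == 0  # the matrix of the system is singular

k = 0
solution = [cofactor(A, k, j) for j in range(3)]  # (A_k1, A_k2, A_k3)

# Every equation of the homogeneous system is satisfied.
for row in A:
    assert sum(a_ij * x_j for a_ij, x_j in zip(row, solution)) == 0
```

For a different row index `k` the cofactors give another solution, which may be proportional to the first or may be trivial.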

Corollary 29. Let

$$a_{i1}x_1 + a_{i2}x_2 + \cdots + a_{in}x_n = 0 \qquad (i = 1, 2, \cdots, n - 1)$$

be a system of $n - 1$ homogeneous equations in $n$ unknowns. Let $h_i$
be $(-1)^{i+1}$ times the determinant of the matrix obtained by
suppressing the $i$th column of $A$. Then

$$(h_1, h_2, \cdots, h_n)$$

is a solution.

If we adjoin to the given system another equation all of whose
coefficients are 0 and apply the theorem with $k = n$, we obtain the
corollary. We shall see in §23 that the solution is trivial unless the
rank of the matrix of the coefficients is $n - 1$.

21. Properties of determinants. The literature on determinants is vast
and well known, and determinants are not the primary objective of this
book. We shall therefore be content with the precise statement of a few
of the principal results with an indication of their proofs.

In finding the determinants of the elementary matrices, we have proved
three well-known theorems. Let $U$ be the unit matrix $I$ with its
$i$th and $j$th rows interchanged. If $A$ is any matrix, $B = UA$
differs from $A$ only in having the $i$th and $j$th rows interchanged,
and $B' = AU$ differs from $A$ only in having the $i$th and $j$th
columns interchanged. But $d(U) = -1$, so that

$$d(B) = -d(A), \qquad d(B') = -d(A).$$

Thus we have 

Theorem 30. If the matrix $B$ is obtained from $A$ by the interchange
of two rows or of two columns, $d(B) = -d(A)$.

Corollary 30. If two rows or two columns of $A$ are identical,
$d(A) = 0$.

If $F$ is of characteristic 2, a special proof is required, but the
corollary holds for every field.

Let $W$ be the elementary matrix obtained from $I$ by replacing the 1
in the $(i, i)$ position by $l$. Then $WA$ is obtainable from $A$ by
multiplying each element of the $i$th row by $l$, and $AW$ is
obtainable from $A$ by multiplying each element of the $i$th column by
$l$. But $d(W) = l$, so we have:

Theorem 31. If $B$ is obtained from $A$ by multiplying each element of
any row or any column of $A$ by $l$, then $d(B) = l\,d(A)$.

Let $V_{ij}$ be the elementary matrix obtained from $I$ by replacing
the 0 in row $i$ and column $j$ by $k$. Then $B = V_{ij}A$ is
obtainable from $A$ by adding to each element of the $i$th row $k$
times the corresponding element of the $j$th row. But $d(V_{ij}) = 1$,
so that $B$ and $A$ have the same determinant. Similarly
$B' = AV_{ij}$ is obtainable from $A$ by adding to each element of the
$j$th column $k$ times the corresponding element of the $i$th column,
and similarly $A$ and $B'$ have the same determinant. By a repetition
of the argument, we have:

Theorem 32. If $B$ is obtained from $A$ by adding to any row (or
column) a linear combination of the other rows (or columns), then
$d(B) = d(A)$.

These results are also readily obtainable from the definition

$$d(X) = \sum (-1)^h\, x_{1 i_1} x_{2 i_2} \cdots x_{n i_n},$$

where $h$ is a number of transpositions into which the substitution

$$s = \begin{pmatrix} 1 & 2 & \cdots & n \\ i_1 & i_2 & \cdots & i_n \end{pmatrix}$$

can be factored. In fact, all of the well-known determinant theorems
come directly from this expression and the elementary theorems of
finite group theory. Thus if $X^T$ is the transpose of $X$,

$$d(X^T) = \sum (-1)^h\, x_{i_1 1} x_{i_2 2} \cdots x_{i_n n}$$
$$= \sum (-1)^{h'}\, x_{1 i_1} x_{2 i_2} \cdots x_{n i_n}$$

where $h'$ is a number of transpositions into which the inverse of the
substitution $s$ can be factored. But $h$ and $h'$ are both even or
both odd so that we have:

Theorem 33. $d(A^T) = d(A)$.

Now let us replace the elements of the $k$th row of $X$ by the sums

$$y_{k1} + z_{k1},\ y_{k2} + z_{k2},\ \cdots,\ y_{kn} + z_{kn},$$

denoting the new matrix by $X'$. Denote by $Y$ the matrix obtained by
replacing the $k$th row of $X$ merely by the $y$'s, and denote by $Z$
the matrix obtained by replacing the $k$th row of $X$ merely by the
$z$'s. Then

$$d(X') = \sum (-1)^h\, x_{1 i_1} \cdots (y_{k i_k} + z_{k i_k}) \cdots x_{n i_n}$$
$$= \sum (-1)^h\, x_{1 i_1} \cdots y_{k i_k} \cdots x_{n i_n} + \sum (-1)^h\, x_{1 i_1} \cdots z_{k i_k} \cdots x_{n i_n}$$
$$= d(Y) + d(Z).$$

We have therefore proved

Theorem 34. If each element $a_{ks}$ of the $k$th row of $A$ is a sum

$$b_{ks1} + b_{ks2} + \cdots + b_{ksm}$$

then

$$d(A) = d(A_1) + d(A_2) + \cdots + d(A_m)$$

where $A_i$ differs from $A$ only in that the elements in the $k$th row
are $b_{ksi}$ for $s = 1, 2, \cdots, n$. A similar result holds for
columns.

22. Minors and cofactors. From the array

$$\begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \cdots & \cdots & \cdots & \cdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{bmatrix}$$

one can select certain rows and columns and thereby determine subarrays
or minors. Thus if we select the elements common to the rows 3, 1, 4
and the columns 5, 3, we have the minor

$$A^{314}_{53} = \begin{bmatrix} a_{35} & a_{33} \\ a_{15} & a_{13} \\ a_{45} & a_{43} \end{bmatrix}.$$

If such a minor is square, its determinant is a minor determinant of
$A$.
The following theorem is due to Laplace:

Theorem 35.

$$d(A) = \sum \pm\, d\bigl(A^{r_1 \cdots r_k}_{i_1 \cdots i_k}\bigr)\, d\bigl(A^{r_{k+1} \cdots r_n}_{i_{k+1} \cdots i_n}\bigr)$$

where the summation is over the $\binom{n}{k}$ ways of selecting the
$k$ numbers $i_1, i_2, \cdots, i_k$ from among the numbers
$1, 2, \cdots, n$ without regard for order, and the sign is $+$ or $-$
according as the substitution

$$\begin{pmatrix} r_1 & r_2 & \cdots & r_n \\ i_1 & i_2 & \cdots & i_n \end{pmatrix}$$

is even or odd.

The proof of this theorem is based upon the elementary fact that for
every value of $k$ all permutations of $1, 2, \cdots, n$ are obtained
by separating $1, 2, \cdots, n$ into two unordered sets
$i_1, i_2, \cdots, i_k$ and $i_{k+1}, \cdots, i_n$ in all possible
ways, and then permuting each set in all possible ways. Details of the
proof will be omitted.

23. Rank. We originally defined the row rank of a 
matrix A to be the number of linearly independent row 
vectors in the matrix (§11), and the column rank to be 
the number of linearly independent column vectors. 
Later (§14) these two numbers were shown to be equal, 
and were called the rank of the matrix. 
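The three descriptions of rank that appear in this section can be compared on a small example. The following is a sketch under stated assumptions: `rank` counts independent rows by elimination over the rationals, `max_minor_order` searches for the largest non-singular square minor by brute force, and both names, like the sample matrix, are illustrative only.

```python
from fractions import Fraction
from itertools import combinations

def rank(rows):
    """Number of linearly independent rows, found by elimination."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0
    for col in range(len(m[0])):
        pivot = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][col] / m[r][col]
            m[i] = [x - factor * y for x, y in zip(m[i], m[r])]
        r += 1
        if r == len(m):
            break
    return r

def det(a):
    """Determinant by expansion along the first row."""
    if not a:
        return 1
    return sum((-1) ** j * a[0][j]
               * det([row[:j] + row[j + 1:] for row in a[1:]])
               for j in range(len(a)))

def max_minor_order(a):
    """Order of a non-singular square minor of maximum order."""
    k, n = len(a), len(a[0])
    for p in range(min(k, n), 0, -1):
        for rows in combinations(range(k), p):
            for cols in combinations(range(n), p):
                if det([[a[i][j] for j in cols] for i in rows]) != 0:
                    return p
    return 0

A = [[1, 2, 3, 4], [2, 4, 6, 8], [1, 0, 1, 0]]
transpose = [list(col) for col in zip(*A)]

row_rank, column_rank = rank(A), rank(transpose)
assert row_rank == column_rank == max_minor_order(A)
```

The brute-force search over minors grows very quickly with the size of the matrix and is meant only to mirror the definition.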

Let $p$ denote the order of a non-singular minor of $A$ of maximum
order. We shall now show that $p = r$, so that the rank of a matrix may
be thus defined.

We assume that the column vectors

$$v_i = (a_{1i}, a_{2i}, \cdots, a_{ki}) \qquad (i = 1, 2, \cdots, n)$$

of the matrix

$$A = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \cdots & \cdots & \cdots & \cdots \\ a_{k1} & a_{k2} & \cdots & a_{kn} \end{bmatrix}$$

span a linear space of rank $r$. If there exists a non-singular minor
of order $p$, we may assume the notation to be so chosen that

$$B = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1p} \\ a_{21} & a_{22} & \cdots & a_{2p} \\ \cdots & \cdots & \cdots & \cdots \\ a_{p1} & a_{p2} & \cdots & a_{pp} \end{bmatrix}$$

is non-singular of determinant $b$. The vectors
$v_1, v_2, \cdots, v_p$ are linearly independent, for a linear relation
among them would imply a linear relation among the columns of $B$, and
would therefore imply $b = 0$. Thus $r \geq p$.

If $p = n$, then $r \leq p$ and consequently $r = p$. If $p < n$, then
every matrix

$$B_{lh} = \begin{bmatrix} a_{11} & \cdots & a_{1p} & a_{1h} \\ \cdots & \cdots & \cdots & \cdots \\ a_{p1} & \cdots & a_{pp} & a_{ph} \\ a_{l1} & \cdots & a_{lp} & a_{lh} \end{bmatrix}
\qquad (h = p + 1, \cdots, n;\ l = 1, 2, \cdots, k)$$

is singular. For $l \leq p$, $B_{lh}$ has two equal rows. For $l > p$,
$B_{lh}$ is a minor of $A$ of order $p + 1$ and hence is singular by
assumption. The cofactor of the element $a_{li}$ of the last row of
$B_{lh}$ is independent of $l$, and we may denote it by $A_{hi}$. Then
by (19)

$$A_{h1}a_{l1} + A_{h2}a_{l2} + \cdots + A_{hp}a_{lp} + A_{hh}a_{lh} = 0.$$

Since this holds for $l = 1, 2, \cdots, k$,

$$A_{h1}v_1 + A_{h2}v_2 + \cdots + A_{hp}v_p + A_{hh}v_h = 0.$$

But $A_{hh} = b \neq 0$ so that every vector $v_h$,
$h = p + 1, \cdots, n$, is linearly dependent upon the vectors
$v_1, v_2, \cdots, v_p$. Thus in every case $r \leq p$, and
consequently $r = p$.

MATRIC POLYNOMIALS

24. Ring with unit element. In modern abstract algebra it is customary
to define a ring as a mathematical system consisting of elements and
two operations called addition and multiplication, relative to which
the system is closed, subject to the following postulates.

1. The system is a commutative group relative to addition, the identity
element being denoted by 0, and the inverse of $a$ by $-a$.

2. Multiplication is associative, i.e.,

$$(a \times b) \times c = a \times (b \times c).$$

3. Multiplication is distributive with respect to addition, i.e.,

$$a \times (b + c) = a \times b + a \times c,$$

$$(b + c) \times a = b \times a + c \times a.$$

Most rings which are of importance in mathematics are known as rings
with unit element. In addition to the ring postulates they are subject
to another postulate, namely

4. There exists an identity element 1 of multiplication such that

$$a \times 1 = 1 \times a = a$$

for every element $a$ of the system.

The instances of ring with unit element are numerous and important. The
rational numbers, the real numbers, the complex numbers, and in fact
all fields are instances. So are the rational integers, the integers of
an algebraic field, $p$-adic integers, etc. All of these rings possess
commutative multiplication. Linear algebras which possess a unit
element, and their sets of integral numbers, furnish instances of rings
with unit element which are ordinarily not commutative.

Our particular interest in such rings lies in the fact 
that all nXn matrices whose elements lie in a ring with 
unit element themselves constitute a ring with unit ele- 
ment. This is not true of any more specialized mathe- 
matical system. That is, the coefficient ring in which the 
elements of the matrices lie may be commutative with- 
out the matric ring being commutative, and the 
coefficient ring may have the property that every non- 
zero element has an inverse with respect to multiplica- 
tion without imparting this property to the matric ring. 

25. Polynomial domains. Let us denote by $R$ a ring with unit element
1. Let $x$ be an indeterminate over this ring. The set of all
polynomials in $x$ with coefficients in $R$ constitute a mathematical
system $R[x]$ which is likewise a ring with unit element. This is
called the polynomial domain of $R$. If $R$ is commutative, so is
$R[x]$.

It may be well to recall a few of the properties of indeterminates.* If

$$f(x) = a_0 + a_1x + a_2x^2, \qquad g(x) = b_0 + b_1x + b_2x^2$$

are two elements of $R[x]$, their sum is

$$f(x) + g(x) = (a_0 + b_0) + (a_1 + b_1)x + (a_2 + b_2)x^2,$$

and their product is

$$f(x) \cdot g(x) = a_0b_0 + (a_1b_0 + a_0b_1)x + (a_2b_0 + a_1b_1 + a_0b_2)x^2 + (a_1b_2 + a_2b_1)x^3 + a_2b_2x^4.$$

The indeterminate $x$ is commutative with every number of $R$ even when
two numbers of $R$ are not commutative with each other.

* See the author's An Introduction to Abstract Algebra, Wiley, 1940,
p. 158.

Let us for a moment suppose that $R$ is commutative. Then $f(c)$ is a
unique number of $R$, namely the number

$$a_0 + a_1c + a_2c^2$$

obtained by substituting $c$ for the indeterminate $x$ in the
polynomial $f(x)$. From the equalities

$$f(x) + g(x) = h(x), \qquad f(x) \cdot g(x) = k(x)$$

in $R[x]$ follow the equalities

$$f(c) + g(c) = h(c), \qquad f(c) \cdot g(c) = k(c)$$

in $R$.

But if $R$ is not commutative, it is not usually permissible to
substitute a number of $R$ for the indeterminate in every equality in
$R[x]$. Thus in $R[x]$

$$f(x) = a_0 + a_1x + a_2x^2 = a_0 + xa_1 + x^2a_2$$
$$= a_0 + a_1x + xa_2x = a_0 + xa_1 + xa_2x$$
$$= a_0 + a_1x + x^2a_2 = a_0 + xa_1 + a_2x^2.$$

The six numbers of $R$ obtained by substituting a number $c$ of $R$ for
$x$ could well all be different. Thus let us take for $R$ the ring of
all two-rowed square matrices with rational elements. Let us take for
$f(x)$ the polynomial

«-C M 3- 


If we choose

then


c a-c a- -c a-G a- 


Clearly the functional value /(c) is not well defined. 
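The phenomenon is easy to reproduce. The sketch below does not use the particular matrices of the text; the coefficients `a1`, `a2` and the matrix `c` are hypothetical choices made for illustration, with $a_0 = 0$, and the six ways of writing $f(x)$ listed above are evaluated one by one.

```python
def mmul(a, b):
    n = len(a)
    return [[sum(a[r][i] * b[i][s] for i in range(n)) for s in range(n)]
            for r in range(n)]

def madd(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

# Hypothetical data: f(x) = a1 x + a2 x^2 with two-rowed matrix coefficients.
a1 = [[0, 0], [1, 0]]
a2 = [[1, 0], [0, 0]]
c = [[1, 1], [0, 1]]
c2 = mmul(c, c)

linear = [mmul(a1, c), mmul(c, a1)]                             # a1 c, c a1
quadratic = [mmul(a2, c2), mmul(mmul(c, a2), c), mmul(c2, a2)]  # a2 c^2, c a2 c, c^2 a2

values = [madd(p, q) for p in linear for q in quadratic]
distinct = {tuple(map(tuple, m)) for m in values}
assert len(distinct) == 6   # all six substitutions give different matrices

f_left = madd(mmul(a1, c), mmul(a2, c2))    # coefficients on the left
f_right = madd(mmul(c, a1), mmul(c2, a2))   # coefficients on the right
assert f_left != f_right
```

With these choices the six values are pairwise different, so no single value can be called "the" value of $f$ at $c$.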

Among all the possible functional values of a polynomial, two are of
outstanding importance. Thus if

$$f(x) = a_0 + a_1x + a_2x^2 + \cdots + a_nx^n,$$

we define the left functional value

$$f_L(c) = a_0 + a_1c + a_2c^2 + \cdots + a_nc^n$$

to be the one where all coefficients are to the left of the powers of
$c$, and similarly we define the right functional value,

$$f_R(c) = a_0 + ca_1 + c^2a_2 + \cdots + c^na_n$$

where all the coefficients are on the right.

Even the restriction to left functional values does not immediately
clear away the difficulties. Thus if

$$f(x) = a_0 + a_1x, \qquad g(x) = b_0 + b_1x,$$
$$f(x) \cdot g(x) = h(x) = a_0b_0 + (a_0b_1 + a_1b_0)x + a_1b_1x^2,$$
$$f_L(c) \cdot g_L(c) = a_0b_0 + a_0b_1c + a_1cb_0 + a_1cb_1c,$$

which is in general different from

$$h_L(c) = a_0b_0 + (a_0b_1 + a_1b_0)c + a_1b_1c^2.$$

It is hoped that the reader is now ready to admit that 
any orderly theorem which can be extracted from this 
chaos is worthy of consideration. The following theorem 
should not now appear trivial. That it is far from trivial 
will appear from its numerous applications. 

Theorem 36. Let $R$ be a ring with unit element, and let $R[x]$ be its
polynomial domain. Let $f(x) \cdot g(x) = h(x)$ in $R[x]$. If
$g_L(c) = 0$ in $R$, then $h_L(c) = 0$. If $f_R(c) = 0$ in $R$, then
$h_R(c) = 0$.

We can show the method of proof as conclusively by using the
polynomials given at the beginning of this section as in the general
case. We have

$$h_L(c) = a_0b_0 + (a_1b_0 + a_0b_1)c + (a_2b_0 + a_1b_1 + a_0b_2)c^2 + (a_1b_2 + a_2b_1)c^3 + a_2b_2c^4$$
$$= a_0 \cdot g_L(c) + a_1 \cdot g_L(c) \cdot c + a_2 \cdot g_L(c) \cdot c^2.$$

If $g_L(c) = 0$, then clearly $h_L(c) = 0$.
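The theorem can be tried out on two-rowed matrices. In the sketch below the matrices are hypothetical choices made for illustration: $g(x) = b_0 + b_1x$ is built with $b_0 = -b_1c$, so that $g_L(c) = 0$ although $b_1$ and $c$ do not commute, and the product $h(x) = f(x) \cdot g(x)$ is formed in $R[x]$, where $x$ commutes with every coefficient. Polynomials are stored as lists of coefficients, lowest degree first.

```python
def zero(n):
    return [[0] * n for _ in range(n)]

def identity(n):
    return [[int(r == s) for s in range(n)] for r in range(n)]

def mmul(a, b):
    n = len(a)
    return [[sum(a[r][i] * b[i][s] for i in range(n)) for s in range(n)]
            for r in range(n)]

def madd(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def poly_mul(f, g):
    """Product in R[x]: the coefficient of x^k is the sum of a_i b_j, i + j = k."""
    n = len(f[0])
    h = [zero(n) for _ in range(len(f) + len(g) - 1)]
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            h[i + j] = madd(h[i + j], mmul(a, b))
    return h

def left_value(f, c):
    """f_L(c) = a_0 + a_1 c + a_2 c^2 + ..."""
    total, power = zero(len(c)), identity(len(c))
    for coeff in f:
        total = madd(total, mmul(coeff, power))
        power = mmul(power, c)
    return total

def right_value(f, c):
    """f_R(c) = a_0 + c a_1 + c^2 a_2 + ..."""
    total, power = zero(len(c)), identity(len(c))
    for coeff in f:
        total = madd(total, mmul(power, coeff))
        power = mmul(power, c)
    return total

c = [[1, 1], [0, 1]]
b1 = [[0, 1], [1, 0]]
b0 = [[-x for x in row] for row in mmul(b1, c)]   # b0 = -b1 c
f = [[[1, 2], [3, 4]], [[0, 1], [0, 0]]]          # f(x) = a0 + a1 x
g = [b0, b1]
h = poly_mul(f, g)

assert left_value(g, c) == zero(2)   # hypothesis: g_L(c) = 0
assert left_value(h, c) == zero(2)   # conclusion: h_L(c) = 0
```

For the same data the right functional value of $h$ at $c$ is not zero, so the conclusion really does depend on which side the coefficients are written.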

26. Degree of a polynomial. A number $a$ of a ring $R$ is called a
right divisor of zero if there exists another element $a' \neq 0$ of
the ring such that $a'a = 0$. According to this definition, 0 is a
divisor of zero. A number $a \neq 0$ which is a right divisor of zero
is called a proper right divisor of zero.

If $a$ has a right inverse, it cannot be a right divisor of zero. For
$a'a = 0$ would imply

$$(a'a)a^{-1} = a'(aa^{-1}) = a' = 0.$$

Similarly if $a$ has a left inverse, it cannot be a left divisor of
zero.

The concept of degree of a polynomial with coefficients in a ring with
unit element is complicated not by the failure of the commutative law,
which caused the difficulty in the last paragraph, but by the possible
existence of divisors of zero. We shall define the polynomial

$$f(x) = a_0 + a_1x + \cdots + a_{n-1}x^{n-1} + a_nx^n$$

to be of degree $n$ if $a_n \neq 0$, to be of degree $n - 1$ if
$a_n = 0$ but $a_{n-1} \neq 0$, etc. If all the coefficients are 0,
$f(x)$ has no degree. But to be useful this concept must be
strengthened. A polynomial is of proper degree $n$ if it is of degree
$n$ and if $a_n$ has an inverse $a_n^{-1}$ such that

$$a_n^{-1}a_n = a_na_n^{-1} = 1.$$

In particular this will be the case if $R$ is a ring of square matrices
and if $a_n$ is non-singular (§13). Clearly $a_n$ cannot be a divisor
of zero. If $a_n \neq 0$ but does not have a double inverse, $f(x)$ has
no proper degree.

If $f(x)$ is of degree $n$ and

$$g(x) = b_0 + b_1x + \cdots + b_mx^m$$

is of degree $m$, then

$$f(x)\,g(x) = a_0b_0 + \cdots + a_nb_mx^{m+n}.$$

Thus the degree of a product of two polynomials never exceeds the sum
of their degrees. If $f(x)$ is of proper degree $n$, or if $g(x)$ is of
proper degree $m$, then $f(x) \cdot g(x)$ is of degree exactly
$m + n$. If both $f(x)$ and $g(x)$ are proper, so is their product, for

$$b_m^{-1}a_n^{-1} = (a_nb_m)^{-1}.$$

The following theorem is the analogue of the familiar "division
algorithm."

Theorem 37. If $f(x)$ and $g(x)$ are polynomials of $R[x]$, and if
$g(x)$ is of proper degree $l$, then unique polynomials $r(x)$ and
$r_1(x)$ exist, each being 0 or of degree $< l$, and also unique
polynomials $q(x)$ and $q_1(x)$, such that

$$f(x) = q(x)\,g(x) + r(x), \qquad f(x) = g(x)\,q_1(x) + r_1(x).$$

The proof is quite analogous to the proof of its elementary-algebra
counterpart. Suppose, for instance, that

$$f(x) = a_0 + a_1x + a_2x^2 + a_3x^3, \qquad g(x) = b_0 + b_1x + b_2x^2.$$

Since $g(x)$ is assumed to be of proper degree 2, $b_2^{-1}$ exists in
$R$. Then

$$f(x) - a_3b_2^{-1}x \cdot g(x)$$

is 0 or is a polynomial of degree at most 2 which we may denote by
$d_0 + d_1x + d_2x^2$. Then

$$f(x) - a_3b_2^{-1}x \cdot g(x) - d_2b_2^{-1}g(x) = r(x)$$

is 0 or of degree $< 2$, and

$$f(x) = (a_3b_2^{-1}x + d_2b_2^{-1})\,g(x) + r(x),$$

thus establishing the existence of the $q(x)$ and $r(x)$ described in
the theorem. By writing

$$f(x) = a_0 + xa_1 + x^2a_2 + x^3a_3$$

and working similarly on the other side, we can show the existence of
$q_1(x)$ and $r_1(x)$. The method is quite general.

Suppose that

$$f(x) = q(x) \cdot g(x) + r(x) = q_2(x) \cdot g(x) + r_2(x)$$

where $r(x)$ is either 0 or of degree less than the degree $l$ of
$g(x)$, and similarly for $r_2(x)$. Then

$$(q(x) - q_2(x))\,g(x) = r_2(x) - r(x).$$

Unless $q(x) - q_2(x)$ is 0, the left member of this equation is of
degree $\geq l$, whereas the right member is 0 or of degree $< l$.
Hence

$$q(x) - q_2(x) = 0, \qquad r_2(x) - r(x) = 0,$$

and the uniqueness of the $q(x)$ and $r(x)$ is proved.

Theorem 38. If there exist numbers $c$ and $d$ of $R$ such that
$g_L(c) = 0$, $g_R(d) = 0$ where $g(x)$ is described in Theorem 37,
then

$$f_L(c) = r_L(c), \qquad f_R(d) = r_{1R}(d),$$

respectively.

By Theorem 37,

$$h(x) = f(x) - r(x) = q(x)\,g(x).$$

By Theorem 36, $g_L(c) = 0$ implies $h_L(c) = 0$. But
$h_L(c) = f_L(c) - r_L(c)$. The second part follows similarly.

As a special case of this theorem we have the Remainder Theorem:

Corollary 38. If $r$ is the remainder obtained by dividing $f(x)$ on
the right by $x - c$, then $r = f_L(c)$. If $r_1$ is the remainder
obtained by dividing $f(x)$ on the left by $x - c$, then
$r_1 = f_R(c)$.

Evidently $x - c$ is proper of degree 1 so that the $r$ and $r_1$ of
Theorem 38 are in $R$. If $g(x) = x - c$, then both $g_L(c)$ and
$g_R(c)$ vanish.
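Division on the right by $x - c$ can be carried out by the usual synthetic scheme, keeping every quotient coefficient to the left of $c$. The sketch below is an illustration under stated assumptions: the function `divide_right_by_linear` is a hypothetical helper, polynomials are lists of two-rowed matrix coefficients with the lowest degree first, and the sample data are arbitrary.

```python
def zero(n):
    return [[0] * n for _ in range(n)]

def identity(n):
    return [[int(r == s) for s in range(n)] for r in range(n)]

def mmul(a, b):
    n = len(a)
    return [[sum(a[r][i] * b[i][s] for i in range(n)) for s in range(n)]
            for r in range(n)]

def madd(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def poly_mul(f, g):
    """Product in R[x]; the coefficient of x^k is the sum of a_i b_j, i + j = k."""
    n = len(f[0])
    h = [zero(n) for _ in range(len(f) + len(g) - 1)]
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            h[i + j] = madd(h[i + j], mmul(a, b))
    return h

def left_value(f, c):
    """f_L(c) = a_0 + a_1 c + a_2 c^2 + ..."""
    total, power = zero(len(c)), identity(len(c))
    for coeff in f:
        total = madd(total, mmul(coeff, power))
        power = mmul(power, c)
    return total

def divide_right_by_linear(f, c):
    """Return (q, r) with f(x) = q(x) (x - c) + r, for deg f >= 1."""
    n = len(f) - 1
    q = [None] * n
    q[n - 1] = f[n]
    for i in range(n - 1, 0, -1):
        q[i - 1] = madd(f[i], mmul(q[i], c))   # from a_i = q_{i-1} - q_i c
    r = madd(f[0], mmul(q[0], c))              # from a_0 = r - q_0 c
    return q, r

c = [[1, 1], [0, 1]]
f = [[[1, 2], [3, 4]], [[0, 1], [1, 0]], [[2, 0], [1, 1]]]
q, r = divide_right_by_linear(f, c)

# Rebuild q(x)(x - c) + r and compare with f(x).
minus_c = [[-x for x in row] for row in c]
rebuilt = poly_mul(q, [minus_c, identity(2)])
rebuilt[0] = madd(rebuilt[0], r)

assert rebuilt == f
assert r == left_value(f, c)   # the remainder is the left functional value
```

Division on the left is handled in the same way with the products taken in the opposite order, and then the remainder is the right functional value.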

27. Matrices with polynomial elements. If $R$ is a ring with unit
element, its polynomial domain $R[x]$ is likewise a ring with unit
element. Let us consider a matrix with elements in $R[x]$ such as

$$A = \begin{bmatrix} 2x^3 + x^2 - x + 1 & x^2 + 2x + 2 \\ 3x^2 - x & x^3 + x + 3 \end{bmatrix}.$$

By the definitions of addition and scalar multiplication for matrices
it is clear that

$$A = \begin{bmatrix} 2 & 0 \\ 0 & 1 \end{bmatrix} x^3 + \begin{bmatrix} 1 & 1 \\ 3 & 0 \end{bmatrix} x^2 + \begin{bmatrix} -1 & 2 \\ -1 & 1 \end{bmatrix} x + \begin{bmatrix} 1 & 2 \\ 0 & 3 \end{bmatrix}.$$

Here $x$ is an indeterminate defined over the ring $R$.

On the other hand, one can look upon the last expression for $A$ as a
polynomial whose coefficients are matrices with elements in $R$. In
that case $x$ must be considered as an indeterminate defined over the
ring of two-rowed matrices with elements in $R$. Clearly it makes no
difference which of these points of view we take. That is, the
polynomial domain of the total matric algebra of order $n^2$ over $R$
is isomorphic with the total matric algebra of order $n^2$ over
$R[x]$.

28. The characteristic function. Let $A = (a_{rs})$ be an $n \times n$
matrix with elements in a field $F$. It is evident that $A$ satisfies
an equation of degree $\leq n^2$ with coefficients in $F$. For the
condition

$$\alpha_0 I + \alpha_1 A + \alpha_2 A^2 + \cdots + \alpha_{n^2} A^{n^2} = 0$$

is equivalent to $n^2$ equations in the $n^2 + 1$ unknown coefficients,
and such a system of equations always has a non-trivial solution, by
Corollary 4.

But a much stronger relation than this holds, for the matrix $A$
actually satisfies an equation of degree $n$ with coefficients in $F$.
This fact is not at all obvious.

The matrix $Ix - A$ is non-singular in the field of all rational
functions of $x$, for its rows are linearly independent. Thus for
$n = 3$, the row vectors

$$v_1 = (x - a_{11},\ -a_{12},\ -a_{13}),$$
$$v_2 = (-a_{21},\ x - a_{22},\ -a_{23}),$$
$$v_3 = (-a_{31},\ -a_{32},\ x - a_{33})$$

are independent. For if there were a linear relation

$$c_1v_1 + c_2v_2 + c_3v_3 = 0,$$

it could be assumed without loss of generality that $c_1, c_2, c_3$
were polynomials. If not all of them were zero, we could choose one,
say $c_2$, which was of maximal degree $d$. Then

$$c_2(x - a_{22}) = c_1a_{12} + c_3a_{32}.$$

But the left member of this equation is of degree $d + 1$, while the
right member is of degree at most $d$. Thus we are forced to conclude
that $c_1 = c_2 = c_3 = 0$, and that $v_1, v_2, v_3$ are linearly
independent.

The linear matric polynomial $Ix - A$ can be written
as a matrix whose elements are constants or linear polynomials
in the indeterminate $x$. This matrix has an adjoint
which is a matrix whose non-zero elements are
polynomials in $x$ of degree $n - 1$ or less. This adjoint
can now be written in the form

$$\operatorname{adj}(Ix - A) = C_0 + C_1 x + \cdots + C_{n-1} x^{n-1}$$

where the coefficients are $n \times n$ matrices with elements in
F. By (20)

(21) $$\operatorname{adj}(Ix - A)\,(Ix - A) = d(Ix - A)\, I.$$

The determinant $d(Ix - A)$ is a polynomial of degree $n$
with coefficients in F, which we shall denote by $f(x)$ and
call the characteristic function of A. Then $f(x) \cdot I$ may also
be regarded as a polynomial in $x$ with coefficients in the
ring of $n \times n$ matrices with elements in F. Evidently the
matric polynomial $Ix - A$ becomes 0 when A is substituted
for $x$, so by Theorem 36, the polynomial

$$d(Ix - A)\, I = f(x) \cdot I$$

also vanishes when $x$ is replaced by A. Since $f(x)$ has
scalar coefficients, the left and right functional values
are the same. Since $f(x) \cdot I$ may also be regarded as a
scalar matrix whose diagonal elements are the polynomials
$f(x)$, it follows from $f(A) \cdot I = 0$ that $f(A) = 0$.
We have therefore proved the important Hamilton-Cayley Theorem:

Theorem 39. Every matrix A satisfies its characteristic
equation

$$f(x) = d(Ix - A) = 0.$$
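
The theorem can be checked numerically on a small case. The following Python sketch is only an illustration under stated assumptions: the helper names (identity, mat_mul, char_poly, poly_at_matrix) are hypothetical, the coefficients of d(Ix - A) are computed with the Faddeev-LeVerrier recursion rather than with the adjoint argument used in the proof, and the sample matrix is the 3 x 3 matrix of the numerical example in §32. Exact rational arithmetic is used so that no rounding occurs.

```python
from fractions import Fraction


def identity(n):
    """n x n identity matrix with exact rational entries."""
    return [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]


def mat_mul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


def char_poly(A):
    """Coefficients of d(Ix - A), highest power first (Faddeev-LeVerrier)."""
    n = len(A)
    I = identity(n)
    coeffs = [Fraction(1)]
    M = [[Fraction(0)] * n for _ in range(n)]
    for k in range(1, n + 1):
        AM = mat_mul(A, M)
        M = [[AM[i][j] + coeffs[-1] * I[i][j] for j in range(n)]
             for i in range(n)]
        AM = mat_mul(A, M)
        coeffs.append(-sum(AM[i][i] for i in range(n)) / k)
    return coeffs


def poly_at_matrix(coeffs, A):
    """Substitute the matrix A into a scalar polynomial (Horner's rule)."""
    n = len(A)
    I = identity(n)
    result = [[Fraction(0)] * n for _ in range(n)]
    for c in coeffs:
        RA = mat_mul(result, A)
        result = [[RA[i][j] + c * I[i][j] for j in range(n)]
                  for i in range(n)]
    return result


A = [[Fraction(v) for v in row]
     for row in [[7, 4, -1], [4, 7, -1], [-4, -4, 4]]]
f = char_poly(A)               # coefficients of f(x) = d(Ix - A)
f_of_A = poly_at_matrix(f, A)  # Theorem 39 asserts this is the zero matrix
```

For this matrix the computed coefficients agree with the characteristic function found by hand in §32, and every element of f_of_A is zero.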

29. The minimum function. Since every $n \times n$ matrix
A with elements in F satisfies an equation with coefficients
in F, it is evident that it satisfies such an equation
of lowest possible degree. The degree $\mu$ of such an equation
is called the index of A. The Hamilton-Cayley Theorem
tells us that $\mu \le n$. If $\mu < n$, A is called derogatory.
That derogatory matrices exist is demonstrated by the
fact that scalar matrices satisfy equations of the first
degree.

Now let

$$m(x) = x^{\mu} + m_1 x^{\mu-1} + \cdots + m_{\mu} = 0$$

be an equation with coefficients in F of minimum degree
$\mu$ satisfied by the matrix A. The function $m(x)$ has been
chosen with leading coefficient equal to 1. If $g(x) = 0$ is
any equation with coefficients in F such that $g(A) = 0$,
we can write

$$g(x) = q(x)\, m(x) + r(x)$$

where $r(x)$ is 0 or of degree $< \mu$. But $g(A) = 0$ and $m(A) = 0$,
so that $r(A) = 0$ by Theorem 38. Since no non-zero polynomial
$r(x)$ of degree $< \mu$ with $r(A) = 0$ exists, it follows
that $r(x) = 0$ and $m(x)$ is a divisor of $g(x)$. We have
proved

Theorem 40. The minimum function of a matrix A
with elements in F is unique up to a non-zero constant factor.
If $g(x)$ is any polynomial with coefficients in F such
that $g(A) = 0$, then $g(x)$ is divisible by the minimum function
of A.

As we have seen,

$$\operatorname{adj}(Ix - A)\,(Ix - A) = f(x) \cdot I.$$

If the elements of $\operatorname{adj}(Ix - A)$ have a greatest common
divisor of degree $> 0$, this polynomial will be a divisor of
every element of the left member of the above equation
and will therefore be a divisor of $f(x)$. We shall venture
to define the quotient of $\operatorname{adj}(Ix - A)$ by that greatest
common divisor of its elements whose leading coefficient
is 1 as the supplement of $Ix - A$, and to write it as
$\operatorname{sup}(Ix - A)$. Its elements are relatively prime polynomials
in $x$, and it satisfies an equation

(22) $$\operatorname{sup}(Ix - A) \cdot (Ix - A) = h(x)\, I$$

where $h(x)$ is a divisor of the characteristic polynomial
$f(x)$ and has 1 as its leading coefficient.

Theorem 41. The polynomial h(x) is the minimum 
function of A . 

The proof that $h(A) = 0$ follows from (22) by the
same argument that was used in the proof of Theorem
39. Hence $m(x)$ divides $h(x)$ by Theorem 40, so that we
may write

$$h(x) = k(x)\, m(x).$$

Let $g(x)$ be any polynomial with coefficients in F
such that $g(A) = 0$. By the division algorithm (Theorem
37) there exists a matric polynomial $P(x)$ and a matrix
$Q$ with scalar elements such that

$$g(x) \cdot I = P(x) \cdot (xI - A) + Q.$$

By Corollary 38, $Q = g(A) \cdot I = 0$ so that

$$g(x)\, I = P(x)\,(xI - A).$$

Now $m(x)$ is a polynomial which vanishes for $x = A$.
Set

(23) $$m(x) \cdot I = P(x) \cdot (xI - A).$$

Then by (22)

$$\operatorname{sup}(Ix - A) \cdot (Ix - A) = k(x) \cdot P(x) \cdot (Ix - A).$$

Since $Ix - A$ is non-singular,

$$\operatorname{sup}(Ix - A) = k(x) \cdot P(x).$$

That is, every element of the matrix $\operatorname{sup}(Ix - A)$ is
divisible by the scalar polynomial $k(x)$. But the elements
of $\operatorname{sup}(Ix - A)$ are relatively prime polynomials
so that $k(x)$ is a non-zero constant. Since both $m(x)$ and
$h(x)$ have leading coefficient 1, they are equal.

Theorem 42. The distinct factors of the characteristic
function $f(x)$ of A which are irreducible in F coincide with
the distinct factors of the minimum function $m(x)$ which
are irreducible in F.

Since $m(x)$ divides $f(x)$, every irreducible factor of
$m(x)$ occurs in $f(x)$. From (23) we have, by taking the
determinant of each member,

$$[m(x)]^n = d(P(x))\, f(x).$$

Consequently each irreducible factor of $f(x)$ occurs as a
factor of $m(x)$.

Corollary 42. If the characteristic function of A is a
product of distinct irreducible factors, then A is not derogatory.

30. The rank of a polynomial in a matrix. If A is a
matrix of order $n$ and rank $r \le n$ with minimum function
$m(x)$, the polynomial $m(A)$ is of rank 0. The question
arises whether there are other polynomials in A with
scalar coefficients whose ranks are less than $n$ but greater
than 0.

Theorem 43. Let the matrix A have the minimum
function $m(x)$. Let $f(x)$ be any polynomial with scalar coefficients,
and let $d(x)$ be the greatest common divisor of
$m(x)$ and $f(x)$. Then the rank of $f(A)$ is equal to the rank
of $d(A)$.

Since $d(x)$ is a divisor of $f(x)$, there exists a polynomial
$k(x)$ such that $f(x) = k(x)\, d(x)$. Then

$$f(A) = k(A)\, d(A),$$

whence it follows by Theorem 14 that the rank of $f(A)$
is less than or equal to the rank of $d(A)$.

Since $d(x)$ is a greatest common divisor of $m(x)$ and
$f(x)$, there exist polynomials $p(x)$ and $q(x)$ such that

$$d(x) = p(x) \cdot m(x) + q(x)\, f(x).$$

Then, since $m(A) = 0$,

$$d(A) = q(A) \cdot f(A).$$

Again from Theorem 14 it follows that the rank of $d(A)$
is less than or equal to the rank of $f(A)$. Thus $d(A)$ and
$f(A)$ have the same rank.
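
A small computation may make the statement concrete. The sketch below is an illustration only: the helpers rank, mat_mul and shifted are hypothetical names, and the polynomials f(x) = (x - 3)(x - 5) and g(x) = x - 5 are chosen arbitrarily. The matrix is that of the numerical example in §32, whose minimum function is found there to be (x - 3)(x - 12), so the greatest common divisor of m(x) and f(x) is d(x) = x - 3, while g(x) is prime to m(x).

```python
from fractions import Fraction


def rank(rows):
    """Rank of a matrix (list of rows) by exact Gaussian elimination."""
    M = [[Fraction(v) for v in row] for row in rows]
    r = 0
    for col in range(len(M[0])):
        pivot = next((i for i in range(r, len(M)) if M[i][col] != 0), None)
        if pivot is None:
            continue
        M[r], M[pivot] = M[pivot], M[r]
        for i in range(r + 1, len(M)):
            factor = M[i][col] / M[r][col]
            M[i] = [a - factor * b for a, b in zip(M[i], M[r])]
        r += 1
    return r


def mat_mul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


def shifted(A, c):
    """Return the matrix A - cI."""
    return [[A[i][j] - (c if i == j else 0) for j in range(len(A))]
            for i in range(len(A))]


A = [[7, 4, -1], [4, 7, -1], [-4, -4, 4]]

f_of_A = mat_mul(shifted(A, 3), shifted(A, 5))  # f(A), f(x) = (x - 3)(x - 5)
d_of_A = shifted(A, 3)                          # d(A), d(x) = x - 3
rank_f, rank_d = rank(f_of_A), rank(d_of_A)     # equal by the theorem
rank_g = rank(shifted(A, 5))                    # g(x) = x - 5 is prime to m(x)
```

Here f(A) and d(A) both have rank 1, while g(A) has the full rank 3.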

Theorem 44. The matrix $f(A)$ is singular if and only
if $f(x)$ has a factor of degree $\ge 1$ in common with the minimum
function of A.

Let $d(x)$ be any divisor of degree $\ge 1$ of the minimum
function $m(x)$ of A. Let $m(x) = k(x)\, d(x)$. Then

$$0 = m(A) = k(A)\, d(A).$$

But $k(A)$ is of rank greater than 0, for $k(x)$ is of lower degree
than the minimum function $m(x)$. If $d(A)$ were non-singular,
the product $k(A)\, d(A)$ would likewise be of
rank greater than 0, by Theorem 17. Thus $d(A)$ is singular.

If $f(x)$ has a greatest common divisor $d(x)$ of degree
$\ge 1$ in common with $m(x)$, then $f(A)$ has the same rank
as $d(A)$ by Theorem 43, and hence is singular. If $f(x)$ is
relatively prime to $m(x)$, their greatest common divisor
is $d(x) = 1$, so that $d(A) = I$, which is of rank $n$. Hence
$f(A)$ is non-singular.

* See, for instance, L. Weisner, Introduction to the Theory of
Equations, Macmillan, 1938, p. 30.

31. Matrix having a given minimum function. Let

$$m(x) = a_0 + a_1 x + a_2 x^2 + x^3$$

be a given polynomial with coefficients in a field. We
have assumed that the degree of the polynomial is 3, but
the argument applies equally well to a polynomial of any
degree. It is no restriction to assume that the leading
coefficient is 1.

Consider the matrix

$$A = \begin{bmatrix} 0 & 0 & -a_0 \\ 1 & 0 & -a_1 \\ 0 & 1 & -a_2 \end{bmatrix}$$

which has the negatives of the coefficients of $m(x)$ in the
last column, 1's in the diagonal just below the main
diagonal, and 0's elsewhere. Evidently

$$d(Ix - A) = \begin{vmatrix} x & 0 & a_0 \\ -1 & x & a_1 \\ 0 & -1 & x + a_2 \end{vmatrix}.$$

If we add to the second row $x$ times the third row, the
determinant is not changed (Theorem 32). That is,

$$d(Ix - A) = \begin{vmatrix} x & 0 & a_0 \\ -1 & 0 & x^2 + a_2 x + a_1 \\ 0 & -1 & x + a_2 \end{vmatrix}.$$

Now add to the first row $x$ times the second row.

$$d(Ix - A) = \begin{vmatrix} 0 & 0 & x^3 + a_2 x^2 + a_1 x + a_0 \\ -1 & 0 & x^2 + a_2 x + a_1 \\ 0 & -1 & x + a_2 \end{vmatrix}.$$

Now expand by the Laplace expansion in terms of the
elements of the first row (Theorem 35).

$$d(Ix - A) = x^3 + a_2 x^2 + a_1 x + a_0 = m(x).$$

Thus the matrix A has $m(x)$ as its characteristic polynomial.

We may see that $m(x)$ is also the minimum polynomial
of A. Evidently the $(n-1)$-rowed minor determinant
in the lower left-hand corner of $d(Ix - A)$ is
equal to $\pm 1$, so that the $(n-1)$-rowed minor determinants
are relatively prime polynomials. But these
polynomials are the elements of $\operatorname{adj}(Ix - A)$. It now follows
from (22) that

$$\operatorname{adj}(Ix - A) = \operatorname{sup}(Ix - A)$$

so that the minimum polynomial $m(x)$ coincides with
the characteristic polynomial $f(x)$.

We shall call A the companion matrix of its minimum
equation.
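
The construction of this section is easily mechanized. The following Python sketch, with hypothetical helper names (companion, mat_mul, poly_at_matrix) and an arbitrarily chosen cubic, builds the matrix described above and checks two things: that A satisfies m(x) = 0, and that the first columns of I, A, A^2 are the unit vectors, so that I, A, A^2 are linearly independent and no polynomial of degree less than 3 can vanish for A.

```python
def companion(a):
    """Companion matrix of x^n + a[n-1] x^(n-1) + ... + a[1] x + a[0]."""
    n = len(a)
    C = [[0] * n for _ in range(n)]
    for i in range(1, n):
        C[i][i - 1] = 1        # 1's just below the main diagonal
    for i in range(n):
        C[i][n - 1] = -a[i]    # negatives of the coefficients, last column
    return C


def mat_mul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


def poly_at_matrix(a, C):
    """Evaluate x^n + a[n-1] x^(n-1) + ... + a[0] at the matrix C (Horner)."""
    n = len(C)
    I = [[int(i == j) for j in range(n)] for i in range(n)]
    result = I                 # leading coefficient 1
    for c in reversed(a):
        RC = mat_mul(result, C)
        result = [[RC[i][j] + c * I[i][j] for j in range(n)]
                  for i in range(n)]
    return result


a = [7, -5, 2]                 # m(x) = x^3 + 2x^2 - 5x + 7, an arbitrary choice
A = companion(a)
m_of_A = poly_at_matrix(a, A)  # should be the zero matrix

I3 = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
A2 = mat_mul(A, A)
# First columns of I, A, A^2: the unit vectors, hence I, A, A^2 are independent.
first_columns = [[P[i][0] for i in range(3)] for P in (I3, A, A2)]
```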

32. The norm. By setting $x = 0$ in the equation

$$d(Ix - A) = f(x),$$

we see that the constant term of the characteristic function
$f(x)$ multiplied by $(-1)^n$ is equal to the determinant
of A. If

$$m(x) = x^{\mu} + m_1 x^{\mu-1} + \cdots + m_{\mu}$$

is the minimum function of A, we shall define $(-1)^{\mu} m_{\mu}$
to be the norm of A, written $n(A)$.

Theorem 45. The norm of A is 0 if and only if A is
singular.

For by Theorem 44 A is singular if and only if $x$
divides $m(x)$, in which case $m_{\mu} = 0$.

Let us write the minimum function of a matrix $A \ne 0$
as

$$m(x) = x^{\mu} + m_1 x^{\mu-1} + \cdots + m_{\mu-1} x \pm n(A).$$

Since $m(A) = 0$,

$$A\,(A^{\mu-1} + m_1 A^{\mu-2} + \cdots + m_{\mu-1} I)
 = (A^{\mu-1} + m_1 A^{\mu-2} + \cdots + m_{\mu-1} I)\,A = \mp n(A)\, I.$$

The matrix in parenthesis cannot be 0, for $m(x)$, the
minimum polynomial, was of degree $\mu$. Hence if $n(A) = 0$,
A is both a left and a right proper divisor of zero.
If $n(A) \ne 0$,

$$A \cdot \frac{A^{\mu-1} + m_1 A^{\mu-2} + \cdots + m_{\mu-1} I}{\mp n(A)} = I$$

so that

$$A^{-1} = \frac{A^{\mu-1} + m_1 A^{\mu-2} + \cdots + m_{\mu-1} I}{\mp n(A)}.$$

Hence A has a left and right inverse and cannot be a left
or a right divisor of zero. We have therefore proved

Theorem 46. A matrix $A \ne 0$ is a proper left and right
divisor of zero if and only if it is singular. If A is singular,
there exists a polynomial $m_1(x)$ of degree $\mu - 1$ such that

$$m_1(A) \cdot A = A \cdot m_1(A) = 0.$$

If A is non-singular, its inverse can be written as a polynomial
in A.

We illustrate with a numerical example. Let

$$A = \begin{bmatrix} 7 & 4 & -1 \\ 4 & 7 & -1 \\ -4 & -4 & 4 \end{bmatrix}.$$

Let the characteristic function be

$$d(Ix - A) = f(x) = x^3 + a_2 x^2 + a_1 x + a_0.$$

Then $a_2$ is the negative of the sum of the diagonal elements,
$a_1$ is the sum of the two-rowed minor determinants
which are symmetrically placed with respect to
the principal diagonal, and $a_0$ is the negative of the
determinant. That is,

$$a_2 = -(7 + 7 + 4) = -18,$$

$$a_1 = \begin{vmatrix} 7 & 4 \\ 4 & 7 \end{vmatrix}
 + \begin{vmatrix} 7 & -1 \\ -4 & 4 \end{vmatrix}
 + \begin{vmatrix} 7 & -1 \\ -4 & 4 \end{vmatrix} = 33 + 24 + 24 = 81,$$

$$a_0 = -\begin{vmatrix} 7 & 4 & -1 \\ 4 & 7 & -1 \\ -4 & -4 & 4 \end{vmatrix} = -108.$$

Thus

$$f(x) = x^3 - 18x^2 + 81x - 108 = (x - 3)^2 (x - 12).$$
Since $f(x)$ has a repeated factor, it is worth while to
see if A is derogatory. The only possibility for $m(x)$,
other than to equal $f(x)$, is

$$m(x) = (x - 3)(x - 12),$$

since $m(x)$ and $f(x)$ have the same distinct irreducible
factors by Theorem 42. Now

$$(A - 3I)(A - 12I) = \begin{bmatrix} 4 & 4 & -1 \\ 4 & 4 & -1 \\ -4 & -4 & 1 \end{bmatrix}
 \begin{bmatrix} -5 & 4 & -1 \\ 4 & -5 & -1 \\ -4 & -4 & -8 \end{bmatrix} = 0,$$

so the minimum function is

$$m(x) = x^2 - 15x + 36 = x(x - 15) + 36.$$

Then the inverse of A is

$$A^{-1} = \frac{A - 15I}{-36} = -\frac{1}{36} \begin{bmatrix} -8 & 4 & -1 \\ 4 & -8 & -1 \\ -4 & -4 & -11 \end{bmatrix}.$$


We shall conclude the chapter with

Corollary 46. If $f(x)$ is relatively prime to the minimum
function of A, then $f(A)$ has an inverse $f^{-1}(A)$ which
is a polynomial in A.

By Theorem 44 $f(A)$ is non-singular. By Theorem 46
its inverse can be written as a polynomial in $f(A)$, and
consequently as a polynomial in A.



CHAPTER V
UNION AND INTERSECTION

33. Complementary spaces. Two vectors

$$\alpha = (a_1, a_2, \cdots, a_n), \qquad \phi = (v_1, v_2, \cdots, v_n)$$

of the total vector space S of dimension $n$ are said to be
orthogonal if their inner product vanishes, that is, if

$$\alpha \cdot \phi = a_1 v_1 + a_2 v_2 + \cdots + a_n v_n = 0.$$

If $\phi$ is orthogonal to each of the vectors $\alpha$ and $\beta$, then

$$(k_1 \alpha + k_2 \beta) \cdot \phi = k_1\, \alpha \cdot \phi + k_2\, \beta \cdot \phi = 0,$$

so that $\phi$ is orthogonal to every vector of the form
$k_1 \alpha + k_2 \beta$. In general, if $\phi$ is orthogonal to any number of
vectors $\alpha, \beta, \cdots$, it is orthogonal to every vector of the
linear system (§7) $S_1$ which they define. We then say
that $\phi$ is orthogonal to the linear system $S_1$, and write

$$S_1 \cdot \phi = 0.$$

Now consider the set $S_1'$ of all vectors $\phi, \chi, \cdots$
which are orthogonal to the linear system $S_1$. Then

$$S_1 \cdot (k_1 \phi + k_2 \chi + \cdots) = k_1 S_1 \cdot \phi + k_2 S_1 \cdot \chi + \cdots = 0$$

so that $k_1 \phi + k_2 \chi + \cdots$ is in $S_1'$. Thus the set $S_1'$ is a
linear system of vectors. We call $S_1'$ the orthogonal complement*
of $S_1$.

* When $S_1$ is a real Euclidean space, $S_1'$ is its orthogonal complement
in the sense in which geometers use the term. But the reader
should be warned not to look for a similar geometric interpretation
in general. Thus if F is the complex field a vector such as $(i, 0, 1)$
may be in both $S_1$ and $S_1'$.

According to Theorem 8, $S_1$ has a basis of the form

$$\alpha_1 = (a_{11}, 0, \cdots, 0), \quad \alpha_2 = (a_{21}, a_{22}, \cdots, 0), \quad \cdots, \quad \alpha_n = (a_{n1}, a_{n2}, \cdots, a_{nn})$$

where it is understood that either $a_{ii} \ne 0$ or else $\alpha_i$ is not
present. If $\chi = (x_1, x_2, \cdots, x_n)$ is a vector of $S_1'$, then

(25)
$$\begin{aligned}
a_{11} x_1 &= 0, \\
a_{21} x_1 + a_{22} x_2 &= 0, \\
a_{31} x_1 + a_{32} x_2 + a_{33} x_3 &= 0, \\
\cdots \\
a_{n1} x_1 + a_{n2} x_2 + \cdots + a_{nn} x_n &= 0.
\end{aligned}$$

If $a_{11} \ne 0$, then $x_1 = 0$. If $a_{22} \ne 0$, $x_2$ is a multiple of $x_1$. In
general if $a_{ii} \ne 0$, $x_i$ is a definite linear combination of
$x_1, x_2, \cdots, x_{i-1}$. On the other hand, if $a_{ii} = 0$, $x_i$ is completely
arbitrary. If $S_1$ is of rank $r$, exactly $n - r$ equations
are missing, $n - r$ of the $x_i$ are arbitrary, and the
other $x_i$'s are definite linear combinations of these. Thus
the linear system $S_1'$ can be written

$$\chi = x_{i_1} \phi_{i_1} + x_{i_2} \phi_{i_2} + \cdots + x_{i_{n-r}} \phi_{i_{n-r}}$$

where the $\phi$'s are vectors whose components are fixed
numbers of the field, while the $x$'s are arbitrary. Moreover,
the $\phi$'s are linearly independent, since each $\phi_{i_j}$ has
a 1 in the $i_j$-th position and 0's in each preceding position.
Thus we have proved

Theorem 47. If the linear system $S_1'$ of rank $r'$ is the
orthogonal complement of the linear system $S_1$ of rank $r$,
then

$$r + r' = n.$$

We defined $S_1'$ to be the set of all vectors orthogonal
to every vector of $S_1$. It is now evident that $S_1$ is likewise
the orthogonal complement of $S_1'$. For clearly the
orthogonal complement $S_1''$ of $S_1'$ contains $S_1$, and since
$S_1$ and $S_1''$ are both of rank $n - r'$, they coincide.

By way of illustration, let us suppose that $S_1$ has the
basis

$$\alpha_2 = (2, 1, 0, 0), \qquad \alpha_4 = (-1, 2, 3, 1).$$

Equations (25) become

$$2x_1 + x_2 = 0,$$
$$-x_1 + 2x_2 + 3x_3 + x_4 = 0.$$

Since $\alpha_1$ and $\alpha_3$ are missing, $x_1$ and $x_3$ are arbitrary, and

$$x_2 = -2x_1, \qquad x_4 = x_1 - 2x_2 - 3x_3 = 5x_1 - 3x_3.$$

Then

$$\chi = (x_1, x_2, x_3, x_4) = (x_1, -2x_1, x_3, 5x_1 - 3x_3)
 = x_1 (1, -2, 0, 5) + x_3 (0, 0, 1, -3) = x_1 \phi_1 + x_3 \phi_3.$$

The vector $\phi_3$ has a 1 in the third place and 0's in the
first two places. Clearly $\chi = 0$ implies $x_1 = x_3 = 0$, so that
$\phi_1$ and $\phi_3$ are linearly independent and form a basis for
the space $S_1'$ complementary to $S_1$. Then $S_1'$ is of rank 2.
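
The illustration can be checked by forming inner products. In the sketch below the names dot and complement_vector are hypothetical, and the values 2 and 7 given to the arbitrary components are chosen at random; the two basis vectors of the given space and the two basis vectors of its complement are those of the illustration.

```python
def dot(u, v):
    """Inner product of two vectors of the same order."""
    return sum(a * b for a, b in zip(u, v))


alpha2 = (2, 1, 0, 0)
alpha4 = (-1, 2, 3, 1)
phi1 = (1, -2, 0, 5)
phi3 = (0, 0, 1, -3)


def complement_vector(x1, x3):
    """The vector x1*phi1 + x3*phi3 for given values of the free components."""
    return tuple(x1 * p + x3 * q for p, q in zip(phi1, phi3))


chi = complement_vector(2, 7)
# Every inner product of a basis vector of the space with a vector of the
# complement should vanish.
products = [dot(a, v) for a in (alpha2, alpha4) for v in (phi1, phi3, chi)]
```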

34. Linear homogeneous systems. The problem of
finding the general solution of a system of linear homogeneous
equations is essentially the problem of finding
the orthogonal complement of the linear system of vectors
determined by the coefficients of the given equations.
Consider the system of equations

(26)
$$\begin{aligned}
a_{11} x_1 + a_{12} x_2 + \cdots + a_{1n} x_n &= 0, \\
a_{21} x_1 + a_{22} x_2 + \cdots + a_{2n} x_n &= 0, \\
\cdots \\
a_{k1} x_1 + a_{k2} x_2 + \cdots + a_{kn} x_n &= 0,
\end{aligned}$$

and denote by $\alpha_i$ the vector

$$(a_{i1}, a_{i2}, \cdots, a_{in}).$$

The vectors $\alpha_1, \alpha_2, \cdots, \alpha_k$ determine a linear vector
system $S_1$. The set of all vectors $\chi = (x_1, x_2, \cdots, x_n)$
which satisfy (26) constitute the orthogonal complement
$S_1'$ of $S_1$.

If the matrix of the equations (26) is of rank $r$, then
$S_1$ is of rank $r$, and $S_1'$ is of rank $n - r$. The space $S_1'$ has
a basis composed of $n - r$ linearly independent vectors.
Such a basis of $S_1'$ is called a fundamental system of solutions
of the equations (26), for each vector of the basis is
a solution of (26), and every solution of (26) is a linear
combination of them.

Directly from Corollary 9c of the Steinitz Replacement
Theorem we have

Theorem 48. A set of solutions of (26) is a fundamental
system provided it consists of $n - r$ linearly independent
solutions.

Let A be any matrix of $n$ columns of rank $r$. Its row
vectors are the coefficients of a system of equations (26).
Let X be a matrix whose row vectors constitute a fundamental
system of solutions of these equations. Then X
is of rank $n - r$, and

$$A X^T = 0.$$

We shall define a matrix X to be an orthogonal complement
of a matrix A if both A and X have $n$ columns,
if the sum of their ranks is $n$, and if $A X^T = 0$.

If $A X^T = 0$, then $X A^T = 0$, so that A is also an orthogonal
complement of X.

The orthogonal complement is not unique, for if X
spans $S_1'$, so does $QX$ where Q is any non-singular
matrix. Moreover, A may be replaced by $PA$ where P
is any non-singular matrix without disturbing the solution
matrix X. That is, the equations of (26) may be
subjected to any elementary row transformation which
may be useful in simplifying them.

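Consider the equations

$$\begin{aligned}
4x + 8y + 18z + 7w &= 0, \\
4y + 10z + w &= 0, \\
10x + 18y + 40z + 17w &= 0, \\
x + 7y + 17z + 3w &= 0.
\end{aligned}$$

We have

$$A = \begin{bmatrix} 4 & 8 & 18 & 7 \\ 0 & 4 & 10 & 1 \\ 10 & 18 & 40 & 17 \\ 1 & 7 & 17 & 3 \end{bmatrix}.$$

This matrix is of rank 2, and it may readily be verified
that

$$X = \begin{bmatrix} 5 & 1 & 0 & -4 \\ 13 & 0 & 1 & -10 \end{bmatrix}$$

is an orthogonal complement.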
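
The verification mentioned here amounts to three checks: the product of A with the transpose of X vanishes, and the ranks of A and X add up to the number of columns. A minimal Python sketch follows; the helper name rank is hypothetical and exact rational elimination is assumed.

```python
from fractions import Fraction


def rank(rows):
    """Rank of a matrix (list of rows) by exact Gaussian elimination."""
    M = [[Fraction(v) for v in row] for row in rows]
    r = 0
    for col in range(len(M[0])):
        pivot = next((i for i in range(r, len(M)) if M[i][col] != 0), None)
        if pivot is None:
            continue
        M[r], M[pivot] = M[pivot], M[r]
        for i in range(r + 1, len(M)):
            factor = M[i][col] / M[r][col]
            M[i] = [a - factor * b for a, b in zip(M[i], M[r])]
        r += 1
    return r


A = [[4, 8, 18, 7], [0, 4, 10, 1], [10, 18, 40, 17], [1, 7, 17, 3]]
X = [[5, 1, 0, -4], [13, 0, 1, -10]]

# Element (i, j) is the inner product of row i of A with row j of X,
# that is, the (i, j) element of A times the transpose of X.
A_Xt = [[sum(a * x for a, x in zip(row, sol)) for sol in X] for row in A]
ranks = (rank(A), rank(X))     # their sum should equal the 4 columns
```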

35. Union and intersection. Let us denote by S the
total vector space of dimension (that is, rank) $n$. Thus
S consists of all vectors

$$\alpha = (a_1, a_2, \cdots, a_n)$$

whose components are in F. We may choose for S a basis

$$(1, 0, \cdots, 0), \quad (0, 1, \cdots, 0), \quad \cdots, \quad (0, 0, \cdots, 1),$$

so that S is spanned by the row vectors of the identity
matrix I.

A linear subsystem or linear subspace $S_1$ of S is a linear
system of vectors all of which belong to S. We write
$S_1 \subseteq S$. If not every vector of S is in $S_1$, $S_1$ is a proper
subspace of S, and we write $S_1 \subset S$. If $S_1 \subset S$, it must be
true that $S_1$ is of lower rank than S. For if they were of
the same rank $n$, each would have a basis composed of $n$
linearly independent vectors. Each vector of the basis
of $S_1$ would equal a linear combination of the basic
vectors of S. Hence by Corollary 9c, the basis of $S_1$
would be a basis for S, so that S and $S_1$ would coincide.

If $S_1$ and $S_2$ are linear subspaces of S, we define their
union $S_1 \cup S_2$ to be the set of all vectors $\alpha + \beta$ where $\alpha$
ranges over $S_1$ and $\beta$ ranges over $S_2$. The union is a
linear subspace of S. For if $\gamma_1$ and $\gamma_2$ are any two vectors
of $S_1 \cup S_2$,

$$\gamma_1 = \alpha_1 + \beta_1, \qquad \gamma_2 = \alpha_2 + \beta_2$$

where $\alpha_1$ and $\alpha_2$ are in $S_1$, and $\beta_1$ and $\beta_2$ are in $S_2$. Then
for any two numbers $k$ and $l$ of F,

$$k\gamma_1 + l\gamma_2 = (k\alpha_1 + l\alpha_2) + (k\beta_1 + l\beta_2).$$

Since $S_1$ is a linear space, $k\alpha_1 + l\alpha_2$ is in $S_1$, and since $S_2$
is a linear space, $k\beta_1 + l\beta_2$ is in $S_2$. Thus $k\gamma_1 + l\gamma_2$ is in
$S_1 \cup S_2$.

If the vectors $\alpha_1, \alpha_2, \cdots, \alpha_p$ span $S_1$ and the vectors
$\beta_1, \beta_2, \cdots, \beta_q$ span $S_2$, then the vectors

$$\alpha_1, \alpha_2, \cdots, \alpha_p, \beta_1, \beta_2, \cdots, \beta_q$$

span $S_1 \cup S_2$. If $S_1$ is of rank $r_1$ and $S_2$ is of rank $r_2$, the
union contains at most $r_1 + r_2$ linearly independent vectors,
and at least as many as the greater of $r_1$ and $r_2$.
Thus we have

Theorem 49. The rank (or dimension) of $S_1 \cup S_2$ is
at least as great as the greater of the ranks of $S_1$ and $S_2$, and
at most as great as the sum of their ranks.

If $S_1$ and $S_2$ are linear subspaces of S, we define their
intersection $S_1 \cap S_2$ to be the set of all vectors which are in
both $S_1$ and $S_2$. The intersection is a linear subspace of S.
For if $\gamma_1$ and $\gamma_2$ are any two vectors of $S_1 \cap S_2$,

$$\gamma = k\gamma_1 + l\gamma_2$$

is in $S_1$ since $S_1$ is a linear space, and $\gamma$ is in $S_2$ since $S_2$ is
a linear space, so that $\gamma$ is in $S_1 \cap S_2$.

Let $S_1 \cap S_2$ be of rank $m$, $S_1$ of rank $r_1$, and $S_2$ of rank
$r_2$. Let $S_1 \cap S_2$ have a basis

$$\gamma_1, \gamma_2, \cdots, \gamma_m.$$

Since these vectors are linearly independent and are in
both $S_1$ and $S_2$, there exists a basis of $S_1$ of the form

$$\gamma_1, \gamma_2, \cdots, \gamma_m, \alpha_{m+1}, \cdots, \alpha_{r_1},$$

and there exists a basis of $S_2$ of the form

$$\gamma_1, \gamma_2, \cdots, \gamma_m, \beta_{m+1}, \cdots, \beta_{r_2}.$$

This follows immediately from Corollary 9d. Thus
$r_1 \ge m$, $r_2 \ge m$.

Evidently the vectors

(27) $$\gamma_1, \gamma_2, \cdots, \gamma_m, \alpha_{m+1}, \cdots, \alpha_{r_1}, \beta_{m+1}, \cdots, \beta_{r_2}$$

span the union $S_1 \cup S_2$. As a matter of fact, these vectors
are linearly independent and hence constitute a basis for
the union. For suppose that there existed a linear relation

$$k_1 \gamma_1 + \cdots + k_m \gamma_m + k_{m+1} \alpha_{m+1} + \cdots + k_{r_1} \alpha_{r_1}
 + l_{m+1} \beta_{m+1} + \cdots + l_{r_2} \beta_{r_2} = 0.$$

If some $l_i$ were not zero, then because of the linear independence
of the $\beta$'s,

$$l_{m+1} \beta_{m+1} + \cdots + l_{r_2} \beta_{r_2}
 = -k_1 \gamma_1 - \cdots - k_m \gamma_m - k_{m+1} \alpha_{m+1} - \cdots - k_{r_1} \alpha_{r_1}$$

would be a vector $\ne 0$ in both $S_2$ and $S_1$ and therefore in
$S_1 \cap S_2$. It would then equal a linear combination of
$\gamma_1, \gamma_2, \cdots, \gamma_m$. This would imply a linear relation
among the basic vectors of $S_2$, which is impossible. Thus
every $l_i = 0$. Now a relation

$$k_1 \gamma_1 + \cdots + k_m \gamma_m + k_{m+1} \alpha_{m+1} + \cdots + k_{r_1} \alpha_{r_1} = 0$$

would imply a dependence among the basic vectors of
$S_1$ unless every $k_i = 0$. Thus the vectors (27) are linearly
independent, and the rank of $S_1 \cup S_2$ is $r_1 + r_2 - m$.

Let us denote the rank (dimension) of the linear subspace
$S_i$ by $r(S_i)$. We have proved the following theorem:

Theorem 50. $r(S_1) + r(S_2) = r(S_1 \cap S_2) + r(S_1 \cup S_2)$.

The concepts of union and intersection apply to any
number of subspaces

$$S_1, S_2, \cdots, S_k$$

of S. The union, which may be written in either of the
notations

$$S_1 \cup S_2 \cup \cdots \cup S_k, \qquad (S_1, S_2, \cdots, S_k)$$

is composed of all vectors of S which can be written as a
sum

$$\alpha_1 + \alpha_2 + \cdots + \alpha_k$$

where $\alpha_i$ ranges over all vectors of $S_i$. The intersection

$$S_1 \cap S_2 \cap \cdots \cap S_k = [S_1, S_2, \cdots, S_k]$$

is composed of all vectors which are common to all the
subspaces $S_1, S_2, \cdots, S_k$. Both the union and the intersection
are linear subspaces of S, and both operations $\cup$
and $\cap$ are associative and commutative.

Theorem 51. Denote by $S_i'$ the orthogonal complement
of $S_i$, and let

$$D = (S_1, S_2, \cdots, S_k), \qquad M = [S_1, S_2, \cdots, S_k],$$
$$D' = (S_1', S_2', \cdots, S_k'), \qquad M' = [S_1', S_2', \cdots, S_k'].$$

Then $M'$ is the orthogonal complement of $D$, and $M$ is
the orthogonal complement of $D'$.

Let $\phi$ be any vector of the orthogonal complement of
$D$. Since every vector $\alpha_i$ of $S_i$ is in $D$, it is true that
$\alpha_i \cdot \phi = 0$ so that $\phi$ is in $S_i'$ for every $i$, and consequently
is in $M'$.

Now let $\phi$ be any vector of $M'$, and let

$$\alpha = \alpha_1 + \alpha_2 + \cdots + \alpha_k$$

be any vector of $D$, where $\alpha_i$ is in $S_i$. Since $\phi$ is in every
$S_i'$,

$$\alpha \cdot \phi = \alpha_1 \cdot \phi + \alpha_2 \cdot \phi + \cdots + \alpha_k \cdot \phi = 0.$$

Thus $\phi$ is in the orthogonal complement of $D$. Consequently
$M'$ is equal to the orthogonal complement of $D$.

Since $S_i$ is the orthogonal complement of $S_i'$, we
apply the above result to the spaces $S_1', S_2', \cdots, S_k'$
and perceive that $M$ is equal to the orthogonal complement
of $D'$.

36. Divisors and multiples. As we have seen, every
space of $n$-th order vectors can be spanned by the row
vectors of an $n \times n$ matrix. If $S_1$ is spanned by fewer
than $n$ vectors, 0-vectors may be adjoined so as to bring
this number up to $n$. Thus it will be no real restriction,
and a great convenience, to assume that all matrices are
square with $n$ rows and columns. This will be assumed
in the remainder of this chapter.

If three matrices A, C, D exist with elements in a
field F such that

$$A = CD,$$

then D is called a right divisor, and C a left divisor, of A.
Also A is called a left multiple of D and a right multiple
of C.

If C is non-singular, it is both a left and a right divisor
of every matrix A. For then $C^{-1}$ exists, and we may
define $D$ and $D_1$ as the products

$$D = C^{-1} A, \qquad D_1 = A C^{-1}.$$

Thus $A = CD$ and $A = D_1 C$.

But if C or D is singular, A is of the same or lower
rank than either, so that A is clearly not arbitrary. If C
is non-singular, A is called a left associate of D. Since
$C^{-1} A = D$, D is also a left associate of A. If D is non-singular,
A and C are right associates. If C is singular, A
is a proper left multiple of D, and if D is singular, A is a
proper right multiple of C.

If $A = 0$, A is both a left and a right multiple of every
matrix, for

$$0 = 0 \cdot D = C \cdot 0$$

for every C and D. But if A is not zero, C and D are each
of the same or higher rank than A, so that they are not
arbitrary. If $A = 0$, C and D are divisors of zero (§26). If
neither C nor D is 0, they are both proper divisors of
zero. As we have seen (Theorem 46), a matrix is a
proper divisor of zero if and only if it is singular.

If D is a right divisor of two matrices A and B, and if 
D is a left multiple of every common right divisor of A 
and B, then D is called a greatest common right divisor 
(g.c.r.d.) of A and B . A common left multiple of two 
matrices A and B is called a least common left multiple 
(l.c.l.m.) of A and B if it is a right divisor of every com- 
mon left multiple of A and B . 

Similar definitions hold for greatest common left divi- 
sor and least common right multiple. Their properties are 
evidently similar to those of the g.c.r.d. and l.c.l.m. 

If D is a greatest common right divisor of A and B,
so is $PD$ for every non-singular matrix P. For if

$$A = HD, \qquad B = KD,$$

then

$$A = H P^{-1} \cdot PD, \qquad B = K P^{-1} \cdot PD,$$

so that $PD$ is a common right divisor of A and B. If $D_1$
is any common right divisor of A and B, then $D = L D_1$ by
definition of g.c.r.d. Hence

$$PD = P L D_1,$$

so that $D_1$ is a right divisor of $PD$, which therefore is a
greatest common right divisor of A and B.

Similarly if M is a least common left multiple of A
and B, so is $PM$ for every non-singular matrix P. For if

$$M = HA = KB,$$

then

$$PM = PHA = PKB$$

so that $PM$ is a common left multiple of A and B. If $M_1$
is any common left multiple of A and B,

$$M_1 = LM = L P^{-1} \cdot PM,$$

so that $PM$ is a least common left multiple of A and B.
We shall now prove

Theorem 52. Every pair of $n \times n$ matrices A and B
have a greatest common right divisor D expressible in the
form

$$D = PA + QB.$$

Consider the matrix

$$G = \begin{bmatrix} A & 0 \\ B & 0 \end{bmatrix}$$

of order $2n$. By Theorem 18 there exists a non-singular
matrix X of order $2n$ such that $XG$ consists entirely of
0's except perhaps for the elements of the $n \times n$ block in
the upper left corner. That is,

$$\begin{bmatrix} X_{11} & X_{12} \\ X_{21} & X_{22} \end{bmatrix}
 \begin{bmatrix} A & 0 \\ B & 0 \end{bmatrix}
 = \begin{bmatrix} D & 0 \\ 0 & 0 \end{bmatrix}$$

where the submatrices $X_{ij}$ are obtained by cutting X up
into $n \times n$ blocks. These blocks multiply like the elements
of a matrix (see §10) so that

(28) $$X_{11} A + X_{12} B = D.$$

Now X is non-singular, so that it has an inverse Y
which can likewise be cut into blocks. We have

$$\begin{bmatrix} A & 0 \\ B & 0 \end{bmatrix}
 = \begin{bmatrix} Y_{11} & Y_{12} \\ Y_{21} & Y_{22} \end{bmatrix}
 \begin{bmatrix} D & 0 \\ 0 & 0 \end{bmatrix}.$$

Thus

$$A = Y_{11} D, \qquad B = Y_{21} D.$$

These equations tell us that D is a common right divisor
of A and B, and (28) shows that every common right
divisor of A and B is a right divisor of D. Thus D is a
g.c.r.d. of A and B.

The process is an extremely practicable one. Consider
the matrices

$$A = \begin{bmatrix} 2 & 1 & 4 & 2 \\ -1 & 1 & 5 & 3 \\ 3 & 3 & 13 & 7 \\ 1 & 2 & 9 & 5 \end{bmatrix}, \qquad
B = \begin{bmatrix} -1 & 3 & 4 & 0 \\ 0 & -2 & 1 & 3 \\ 2 & 4 & 18 & 10 \\ 0 & 0 & 0 & 0 \end{bmatrix}.$$

We place these eight row vectors in one column and
apply elementary transformations to them until as
many as possible are made 0, and the rest are linearly independent.
In the present example five can be made 0
and the remaining three can be taken to be

$$(-17, 1, 0, 0), \quad (8, 1, 2, 0), \quad (-3, 0, 1, 1).$$

Thus we have the g.c.r.d.

$$D = \begin{bmatrix} 0 & 0 & 0 & 0 \\ -17 & 1 & 0 & 0 \\ 8 & 1 & 2 & 0 \\ -3 & 0 & 1 & 1 \end{bmatrix}.$$
It should be noted that a g.c.r.d. in Hermite form
(§12) can always be found.

37. Divisors and multiples of matric polynomials. If
two matrices can be obtained from two polynomials
$f_1(x)$ and $f_2(x)$ having scalar coefficients by substituting
for $x$ the same matrix A, a g.c.r.d. and a l.c.l.m. can be
easily obtained.

Let $f_1(x)$ and $f_2(x)$ be two polynomials with coefficients
in F, and let $d(x)$ be their greatest common divisor.
Then there exist polynomials $h(x)$, $k(x)$, $p(x)$ and
$q(x)$ with coefficients in F such that

$$f_1(x) = h(x)\, d(x), \qquad f_2(x) = k(x) \cdot d(x),$$
$$d(x) = p(x)\, f_1(x) + q(x)\, f_2(x).$$

Hence for every matrix A

$$f_1(A) = h(A)\, d(A), \qquad f_2(A) = k(A) \cdot d(A),$$
$$d(A) = p(A) \cdot f_1(A) + q(A) \cdot f_2(A).$$

Thus $d(A)$ is a g.c.r.d. of the matrices $f_1(A)$ and $f_2(A)$.

Now let $m(x)$ be a least common multiple of the
polynomials $f_1(x)$ and $f_2(x)$. There exist polynomials
$h(x)$, $k(x)$, $p(x)$ and $q(x)$ with coefficients in F such that

$$m(x) = h(x)\, f_1(x) = k(x)\, f_2(x),$$
$$p(x)\, h(x) + q(x) \cdot k(x) = 1.$$

Then for every matrix A

$$m(A) = h(A)\, f_1(A) = k(A)\, f_2(A)$$

so that $m(A)$ is a common left multiple of $f_1(A)$ and
$f_2(A)$.

That $m(A)$ is a least common left multiple of $f_1(A)$
and $f_2(A)$ is not so immediate. We must show that for
every matrix M such that

$$M = H \cdot f_1(A) = K \cdot f_2(A)$$

there exists a matrix L such that $M = L\, m(A)$. It is not
necessarily true that H, K, M or L is obtainable by substituting
A for $x$ in any polynomial. Now

$$M\, h(A) = H \cdot f_1(A) \cdot h(A) = H \cdot m(A),$$
$$M\, k(A) = K\, f_2(A)\, k(A) = K\, m(A),$$
$$h(A) \cdot p(A) + k(A) \cdot q(A) = I.$$

Hence

$$\begin{aligned}
M &= M\, h(A)\, p(A) + M\, k(A)\, q(A) \\
  &= H \cdot m(A)\, p(A) + K\, m(A)\, q(A) \\
  &= (H\, p(A) + K\, q(A))\, m(A).
\end{aligned}$$

Thus $m(A)$ is a right divisor of every common left multiple
M, and we have proved

Theorem 53. If $d(x)$ is a greatest common divisor
and $m(x)$ a least common multiple of the polynomials
$f_1(x)$ and $f_2(x)$, then $d(A)$ is a g.c.r.d. and $m(A)$ is a
l.c.l.m. of the matrices $f_1(A)$ and $f_2(A)$.

38. Relation of the union to the greatest common
right divisor. It will be the purpose of this section to
prove

Theorem 54. Let A and B be $n \times n$ matrices whose
row vectors span the respective linear spaces $S_1$ and $S_2$. Let
D be a g.c.r.d., and let M be a l.c.l.m. of A and B. Then the
row vectors of D span the union $S_1 \cup S_2$, and the row vectors
of M span the intersection $S_1 \cap S_2$. Conversely every matrix
whose row vectors span $S_1 \cup S_2$ is a g.c.r.d. of A and B, and
every matrix whose row vectors span $S_1 \cap S_2$ is a l.c.l.m. of
A and B.

If D is a g.c.r.d. of A and B, there exist matrices
H, K, P, Q such that

$$A = HD, \qquad B = KD, \qquad D = PA + QB.$$

Let the row vectors of D span the linear space $S_3$. Since
$A = HD$, every row vector of A is a linear combination
of the row vectors of D, so that $S_1 \subseteq S_3$. Similarly $S_2 \subseteq S_3$,
so that $S_1 \cup S_2 \subseteq S_3$. Since $D = PA + QB$, every row vector
of D is equal to the sum of a linear combination of
row vectors of A and a linear combination of row vectors
of B. Thus $S_3 \subseteq S_1 \cup S_2$, and accordingly D spans the
linear space $S_1 \cup S_2$.

If M is a l.c.l.m. of A and B, there exist matrices H
and K such that

$$M = HA = KB.$$

Let M span the space $S_4$. Since every row vector of M
is a linear combination of row vectors of A, and also a
linear combination of row vectors of B, $S_4 \subseteq S_1 \cap S_2$.
Now let $\phi$ be any vector common to $S_1$ and $S_2$. There
exist vectors $\kappa$ and $\lambda$ such that

$$\phi = \kappa \cdot A = \lambda \cdot B.$$

Let V, K, L denote the matrices whose respective first
rows are the components of $\phi$, $\kappa$, $\lambda$, and the rest of whose
elements are 0. Then

$$V = KA = LB$$

so that V is a common left multiple of A and B. Since M
is a l.c.l.m., there exists a matrix H such that

$$V = HM.$$

Let $\eta$ denote the first row vector of H. Then

$$\phi = \eta \cdot M.$$

That is, every vector which is common to $S_1$ and $S_2$ is
equal to a linear combination of the row vectors of M.
Hence $S_1 \cap S_2 \subseteq S_4$, and M spans $S_1 \cap S_2$.

Now let $D_1$ be any matrix which spans the union
$S_1 \cup S_2$. By Theorem 25 there exists a non-singular
matrix P such that $D_1 = PD$, so that $D_1$ is also a g.c.r.d.
of A and B (§36). Similarly if $M_1$ is any matrix which
spans the intersection $S_1 \cap S_2$, there exists a non-singular
matrix Q such that $M_1 = QM$. Hence $M_1$ is also a l.c.l.m.
of A and B.

Two properties of matrices are immediate from this 
theorem. 

Corollary 54a. If D is a g.c.r.d. and M a l.c.l.m. of
A and B, then

r(A) + r(B) = r(M) + r(D).

This follows immediately from Theorem 50. 
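
The rank identity of Corollary 54a can be checked on a small case. The following is a minimal sketch in Python, not part of the original text; the helper name rank and the matrices A, B, M are illustrative choices. It uses Theorem 54: the rows of A and B together span the same space as the rows of a g.c.r.d. D, so r(D) is the rank of the stacked matrix, and M is chosen by inspection so that its rows span the intersection.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a matrix (list of rows) by exact Gauss-Jordan elimination."""
    m = [[Fraction(x) for x in r] for r in rows]
    r = 0
    for c in range(len(m[0])):
        # look for a pivot in column c at or below row r
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(len(m)):
            if i != r and m[i][c] != 0:
                f = m[i][c] / m[r][c]
                m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        r += 1
    return r

A = [[1, 0, 0], [0, 1, 0], [0, 0, 0]]   # row space S_1 = span{e1, e2}
B = [[0, 0, 0], [0, 1, 0], [0, 0, 1]]   # row space S_2 = span{e2, e3}
M = [[0, 1, 0], [0, 0, 0], [0, 0, 0]]   # rows span the intersection span{e2}

r_D = rank(A + B)   # list concatenation stacks the rows, giving the union
assert rank(A) + rank(B) == rank(M) + r_D
print(rank(A), rank(B), rank(M), r_D)
```

Stacking the rows avoids computing D explicitly; any matrix with the same row space has the same rank.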

Corollary 54b. If D and D_1 are two greatest common
right divisors of A and B, there exists a non-singular
matrix P such that D_1 = PD. If M and M_1 are two least
common left multiples of A and B, there exists a non-
singular matrix Q such that M_1 = QM.

These results follow from Theorem 25. 

In §36 we had a practicable method for computing 
the g.c.r.d. of two matrices. With the aid of Theorem 51, 
this same method can be used to compute the l.c.l.m. 
From A and B we may compute their orthogonal com-
plements A' and B' as in §34. Now we may find a g.c.r.d.
D' of A' and B'. The orthogonal complement D of D'
is a l.c.l.m. of A and B.

39. The sum of vector spaces. Let m(x) be the mini-
mum function of the n×n matrix A, and let

m(x) = m_1(x)·m_2(x) ··· m_k(x)

where the factors are relatively prime in pairs. In par-
ticular we may assume that each m_i(x) is a power of an
irreducible polynomial. The least common multiple of
these factors is their product m(x). Then, by Theorem
53, m(A) = 0 is a least common left multiple of the
matrices

m_1(A), m_2(A), ···, m_k(A).

Let the row vectors of the matrices m_i(A) span the
respective linear spaces S_i, and let [S_1, S_2, ···, S_k] de-
note their intersection. Since their l.c.l.m. is m(A) = 0,
this intersection is 0, by Theorem 54. If S_i' is the or-
thogonal complement of S_i, we know by Theorem 51
that the union (S_1', S_2', ···, S_k') is the orthogonal com-
plement of [S_1, S_2, ···, S_k], and is therefore of rank n.
Thus the basic vectors of S_1', S_2', ···, S_k' together span
the total space S.

Since m_1(x) is relatively prime to the product
m_2(x) ··· m_k(x), the greatest common divisor of these
two polynomials is 1. Then, by Theorem 53, I is a
g.c.r.d. of the two matrices

m_1(A),   m_2(A) ··· m_k(A).

By Corollary 54a,

r(m_1(A)) + r(m_2(A) ··· m_k(A)) = r(m(A)) + r(I) = n,

where r denotes rank. Similarly m_2(x) is relatively prime
to the product m_3(x) ··· m_k(x), so that a least common
multiple of these two polynomials is their product, and

r(m_2(A)) + r(m_3(A) ··· m_k(A)) = r(m_2(A) ··· m_k(A)) + n.

We continue until we reach the (k - 1)th equation

r(m_{k-1}(A)) + r(m_k(A)) = r(m_{k-1}(A)·m_k(A)) + n.

Upon adding these k - 1 equations, we have

r(m_1(A)) + r(m_2(A)) + ··· + r(m_k(A)) = (k - 1)n.

Now S_i' is the orthogonal complement of S_i, so that

r(S_i') = n - r(S_i) = n - r(m_i(A)).

Hence

r(S_1') + r(S_2') + ··· + r(S_k') = kn - (k - 1)n = n.

Denote r(S_i') by r_i, and let

σ_{i1}, σ_{i2}, ···, σ_{i r_i}

be r_i vectors which span S_i'. The vectors

(29) σ_{11}, ···, σ_{1 r_1}, σ_{21}, ···, σ_{2 r_2}, ···, σ_{k1}, ···, σ_{k r_k}

are n in number, since the sum of the ranks r_i is n. These
n vectors are linearly independent, since they span S
which is of rank n.

The union of subspaces is often called their sum. 
Thus S is the sum of the subspaces S_1', S_2', ···, S_k' if
the union (S_1', S_2', ···, S_k') is of rank n. Then every
vector φ of S is expressible in the form

φ = φ_1 + φ_2 + ··· + φ_k

where φ_i is in S_i'. If every vector φ is uniquely expres-
sible in this form, S is called the supplementary sum of
the subspaces S_1', S_2', ···, S_k'. The uniqueness is
equivalent to the linear independence of the basic vec-
tors (29). For if

φ = φ_1 + φ_2 + ··· + φ_k = ψ_1 + ψ_2 + ··· + ψ_k

where φ_i and ψ_i are in S_i' and not every φ_i = ψ_i, we
should have

(φ_1 - ψ_1) + (φ_2 - ψ_2) + ··· + (φ_k - ψ_k) = 0.

This would yield a linear dependence relation among the
basic vectors (29). Conversely, a dependence relation
among the basic vectors would mean that 0 could be
represented in two ways as a sum of vectors φ_1 + φ_2
+ ··· + φ_k.

We therefore have the important theorem, 

Theorem 55. Let the minimum function m(x) of the
matrix A be written as a product of relatively prime factors
m_1(x)·m_2(x) ··· m_k(x). Let the row vectors of m_i(A) span
the space S_i and let S_i' be the orthogonal complement of S_i.
Then the total space S is the supplementary sum of the
subspaces S_1', S_2', ···, S_k', and their basic vectors (29)
together span S.

The following example will illustrate this theorem.
Let

        [  -1    5    5    3 ]
    A = [   5    0  -17  -26 ]
        [ -10   -2   32   51 ]
        [   6    2  -19  -31 ]

If F is the rational field, the minimum function of A is

    m(x) = m_1(x)·m_2(x),

    m_1(x) = x^2 + x + 1,   m_2(x) = x^2 - x + 1,

             [  -6   -4   18   32 ]
    m_1(A) = [  14    8  -42  -72 ]
             [ -24  -14   72  124 ]
             [  14    8  -42  -72 ]

             [ -4  -14   8   26 ]
    m_2(A) = [  4    8  -8  -20 ]
             [ -4  -10   8   22 ]
             [  2    4  -4  -10 ]

The row space S_1 of m_1(A) is spanned by the vectors

    (-1, 2, 3, 0), (0, -1, 0, 2)

and its orthogonal complement S_1' by the vectors

    σ_{11} = (3, 0, 1, 0),   σ_{12} = (0, 6, -4, 3).

The row space S_2 of m_2(A) is spanned by the vectors

    (-1, 3, 2, 0), (0, -1, 0, 1)

and its orthogonal complement S_2' by the vectors

    σ_{21} = (2, 0, 1, 0),   σ_{22} = (0, 2, -3, 2).

The matrix

    [ σ_{11} ]   [ 3  0   1  0 ]
    [ σ_{12} ] = [ 0  6  -4  3 ]
    [ σ_{21} ]   [ 2  0   1  0 ]
    [ σ_{22} ]   [ 0  2  -3  2 ]

is non-singular so that its row vectors span S.
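
The figures of this example can be verified mechanically. The sketch below is an illustration added for the reader, in plain Python with integer arithmetic; the helper names (matmul, lincomb, det, annihilates) are our own and not part of the text. It recomputes m_1(A) and m_2(A), confirms that their product is the zero matrix, that each σ is orthogonal to the appropriate row space, and that the matrix of the σ's has a non-zero determinant.

```python
A = [[-1, 5, 5, 3],
     [5, 0, -17, -26],
     [-10, -2, 32, 51],
     [6, 2, -19, -31]]
n = len(A)
I = [[int(i == j) for j in range(n)] for i in range(n)]

def matmul(X, Y):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)] for row in X]

def lincomb(*terms):
    """Sum of c * M over (c, M) pairs of n-by-n matrices."""
    return [[sum(c * M[i][j] for c, M in terms) for j in range(n)] for i in range(n)]

def det(M):
    """Determinant by cofactor expansion along the first row (adequate for 4x4)."""
    if len(M) == 1:
        return M[0][0]
    return sum((-1) ** j * M[0][j] * det([row[:j] + row[j + 1:] for row in M[1:]])
               for j in range(len(M)))

def annihilates(M, v):
    """True if every row of M is orthogonal to v, that is, M v = 0."""
    return all(sum(a * b for a, b in zip(row, v)) == 0 for row in M)

A2 = matmul(A, A)
m1A = lincomb((1, A2), (1, A), (1, I))    # m_1(A) = A^2 + A + I
m2A = lincomb((1, A2), (-1, A), (1, I))   # m_2(A) = A^2 - A + I

sigma = [[3, 0, 1, 0], [0, 6, -4, 3],     # sigma_11, sigma_12 (basis of S_1')
         [2, 0, 1, 0], [0, 2, -3, 2]]     # sigma_21, sigma_22 (basis of S_2')

assert all(x == 0 for row in matmul(m1A, m2A) for x in row)   # m(A) = 0
assert annihilates(m1A, sigma[0]) and annihilates(m1A, sigma[1])
assert annihilates(m2A, sigma[2]) and annihilates(m2A, sigma[3])
assert det(sigma) != 0    # the four sigma vectors are independent and span S
```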

40. Annihilators of vectors. Let A be an n×n ma-
trix whose minimum function is

m(x) = m_1(x)·m_2(x) ··· m_k(x)

where the m_i(x) are powers of distinct irreducible poly-
nomials. That is,

m_1(x) = [l(x)]^h

where l(x) is irreducible.

Since

[l(A)]^{i+1} = l(A)·[l(A)]^i,

it is surely true that

r[l(A)]^i ≥ r[l(A)]^{i+1}

where r denotes rank. Suppose that for some i these
ranks were equal. Denote by S_i the row space of [l(A)]^i
and by S_{i+1} the row space of [l(A)]^{i+1}. Every vector of
S_{i+1} is in S_i so that S_{i+1} ⊆ S_i. If it were true that their
ranks were the same, we should have S_{i+1} = S_i. Then by
Theorem 25 there would exist a non-singular matrix M
such that

[l(A)]^i = M·[l(A)]^{i+1},   i + 1 ≤ h.

Upon multiplying both sides on the right by

[l(A)]^{h-i-1}·m_2(A) ··· m_k(A),

we have

[l(A)]^{h-1}·m_2(A) ··· m_k(A) = M·m(A) = 0,

contradicting the fact that m(x) is the minimum func-
tion of A. Thus

S_0 ⊃ S_1 ⊃ S_2 ⊃ ··· ⊃ S_h.

If B·φ = 0, we shall call the matrix B an annihilator
of the vector φ. All vectors annihilated by B constitute
the null space of B. That is, the null space of B is the
orthogonal complement of the row space of B. The di-
mension of the null space of B is called the nullity of B.
It is equal to the order of B minus its rank.

Denote by S_i' the null space of [l(A)]^i. Since
r(S_i') = n - r(S_i), we have

S_0' ⊂ S_1' ⊂ S_2' ⊂ ··· ⊂ S_h'.

We have proved 

Theorem 56. If [l(x)]^h is the highest power of the ir-
reducible polynomial l(x) which divides the minimum func-
tion of A, and if S_i' is the null space of [l(A)]^i for 1 ≤ i ≤ h,
there exists a vector which is in S_i' but not in S_{i-1}'.

Let l(x) be of degree j, f(x) of degree < ij, and
d(x) be the g.c.d. of f(x) and [l(x)]^i. Then there exist
polynomials t(x) and u(x) such that

t(x)·f(x) + u(x)·[l(x)]^i = d(x).

Let φ be a vector which is in the null space S_i' of [l(A)]^i
but not in the null space S_{i-1}' of [l(A)]^{i-1}. Then

t(A)·f(A)·φ + u(A)·[l(A)]^i·φ = d(A)·φ.

If f(A)·φ = 0, then d(A)·φ = 0; but d(x) is of degree < ij
and is of the form [l(x)]^k, k < i, so that φ would be in
S_{i-1}', contrary to its selection. Hence f(A)·φ ≠ 0, and we
have proved

Corollary 56. If φ is a vector of the null space S_i'
of [l(A)]^i which is not in the null space of [l(A)]^{i-1},
then there exists no polynomial f(x) of degree less than the
degree of [l(x)]^i such that f(A)·φ = 0.

Let φ be in S_i' but not in S_{i-1}', and let

f(x) = c_0 + c_1 x + ··· + c_{j-1} x^{j-1}

be any polynomial with coefficients in F of degree ≤ j - 1
where j is the degree of l(x). Then

[l(A)]^i·f(A)·φ = f(A)·[l(A)]^i·φ = 0

so that the vector f(A)·φ is in S_i'. Since f(x) is of lower
degree than the degree of the irreducible polynomial
l(x), it is prime to [l(x)]^i. Hence there exist polynom-
ials t(x) and u(x) such that

t(x)·f(x) + u(x)·[l(x)]^i = 1,
t(A)·f(A) + u(A)·[l(A)]^i = I,

and consequently

t(A)·f(A)·φ = φ.

If it were true that f(A)·φ were in S_{i-1}', we should have

[l(A)]^{i-1}·f(A)·φ = 0,

and as a result,

t(A)·[l(A)]^{i-1}·f(A)·φ = [l(A)]^{i-1}·φ = 0

so that φ would be in S_{i-1}', contrary to its selection. We
have therefore proved all but the last statement of

Theorem 57. If l(x), as defined in Theorem 56, is of
degree j, and if φ is a vector of S_i' which is not in S_{i-1}',
the same is true of every vector of the form

c_0 φ + c_1 Aφ + c_2 A^2 φ + ··· + c_{j-1} A^{j-1} φ

unless c_0 = c_1 = ··· = c_{j-1} = 0. The vector l(A)·φ is in
S_{i-1}' but not in S_{i-2}'.

To prove the last statement, set

ψ = l(A)·φ.

If φ is in S_i', then clearly

[l(A)]^{i-1}·ψ = [l(A)]^i·φ = 0

so that ψ is in S_{i-1}'. If ψ were in S_{i-2}', we should have

[l(A)]^{i-2}·ψ = [l(A)]^{i-1}·φ = 0

so that φ would be in S_{i-1}', contrary to its selection.

Corollary 57. The null space S_i' of [l(A)]^i is of
dimension d_i ≥ ij where j is the degree of l(x). If A is non-
derogatory with the minimum function [l(x)]^h, then d_i
= ij.

By Theorem 57 the space S_i' contains at least j vec-
tors

φ, Aφ, A^2 φ, ···, A^{j-1} φ

which are in S_i' but not in S_{i-1}', and no linear combina-
tion ≠ 0 of these vectors is in S_{i-1}'. Hence they are
linearly independent, and the dimension d_i of S_i' is
greater than the dimension d_{i-1} of S_{i-1}' by at least j.
Moreover, S_1' is of dimension at least j so that

d_i ≥ d_{i-1} + j ≥ d_{i-2} + 2j ≥ ···
    ≥ d_1 + (i - 1)j ≥ j + (i - 1)j = ij.

If n = hj and [l(A)]^h = 0, then d_h = hj, and for i = h the
first and last members of the above inequality are both
hj. Thus every "≥" must be an "=" so that d_i = ij.
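
A simple instance of the equality d_i = ij may help. The sketch below is an added illustration, not from the text: it takes l(x) = x, so that j = 1, and the 4×4 matrix N with 1's just below the principal diagonal, which is non-derogatory with minimum function x^4 (h = 4). The nullity of N^i, computed as order minus rank, should then equal i. The helper names are our own.

```python
from fractions import Fraction

def rank(rows):
    """Rank by exact Gauss-Jordan elimination over the rationals."""
    m = [[Fraction(x) for x in r] for r in rows]
    r = 0
    for c in range(len(m[0])):
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(len(m)):
            if i != r and m[i][c] != 0:
                f = m[i][c] / m[r][c]
                m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        r += 1
    return r

def matmul(X, Y):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)] for row in X]

n = 4
N = [[int(i == j + 1) for j in range(n)] for i in range(n)]   # 1's below the diagonal

power = [[int(i == j) for j in range(n)] for i in range(n)]   # N^0 = I
nullities = []
for i in range(1, n + 1):
    power = matmul(power, N)             # power now holds N^i
    nullities.append(n - rank(power))    # nullity = order - rank
print(nullities)                         # d_1, d_2, d_3, d_4
```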

We shall conclude with the following theorem. 

Theorem 58. If m(x) is the minimum function of a
matrix A, there exists a vector which is in the null space of
no matrix m_0(A) where m_0(x) is a proper divisor of m(x).

Let us set

m_i(x) = [l_i(x)]^{h_i}   (i = 1, 2, ···, k),

where the l_i(x) are distinct irreducible polynomials. Let
φ_i be a vector which is in the null space of [l_i(A)]^{h_i} but
not in the null space of any lower power of l_i(A). By
Theorem 56 such a vector exists. Define

φ = φ_1 + φ_2 + ··· + φ_k.

Let m_0(x) be any proper divisor of m(x). Then

m_0(A)·φ = m_0(A)·φ_1 + m_0(A)·φ_2 + ··· + m_0(A)·φ_k.

Since

m_i(A)·m_0(A)·φ_i = m_0(A)·m_i(A)·φ_i = 0,

m_0(A)·φ_i is in the null space of m_i(A). Then m_0(A)·φ
will be 0 if and only if m_0(A)·φ_i = 0 for every value of i,
by Theorem 55.

If m_0(x) is a proper divisor of m(x), some one of the
irreducible factors, say l_i(x), occurs in m_0(x) to a lower
power than it occurs in m(x). Let

m_0(x) = g(x)·[l_i(x)]^f,   f < h_i,

where g(x) is relatively prime to l_i(x). Then a greatest
common divisor of m_0(x) and [l_i(x)]^{h_i} is [l_i(x)]^f, so that
polynomials t(x) and u(x) exist such that

t(x)·m_0(x) + u(x)·[l_i(x)]^{h_i} = [l_i(x)]^f.

Hence

t(A)·m_0(A)·φ_i + u(A)·[l_i(A)]^{h_i}·φ_i = [l_i(A)]^f·φ_i.

If m_0(A)·φ_i were 0, it would be true that [l_i(A)]^f·φ_i
= 0, contrary to the selection of φ_i.

THE RATIONAL CANONICAL FORM

41. Similar matrices. Two square matrices A and A'
are called similar if there exists a non-singular matrix P
such that

A' = P^{-1}AP.

We may use the notation A' ~ A to denote similarity.

Since A' = P^{-1}AP implies A = PA'P^{-1}, and there-
fore A = (P^{-1})^{-1}A'P^{-1}, it is clear that A' ~ A implies
A ~ A'. Thus similarity is symmetric. Since A = I^{-1}AI,
similarity is reflexive. Moreover, let

A' = P^{-1}AP,   A'' = Q^{-1}A'Q

where both P and Q are, of course, non-singular. Then

A'' = Q^{-1}(P^{-1}AP)Q = (PQ)^{-1}A(PQ).

Hence A' ~ A and A'' ~ A' imply A'' ~ A so that simi-
larity is transitive. Thus similarity of matrices is a type
of equals relation, or equivalence relation if you prefer.
It is quite to be expected that such a relation will have
important properties.

The concept of similarity is, in fact, one of the cen- 
tral ideas in matric theory. It appears in the study of 
groups of linear transformations, matric representations 
of algebras, pencils of quadratic forms, and in connec- 
tion with various problems in projective geometry (see 
§56), and in other connections. 

If A' = P^{-1}AP,   B' = P^{-1}BP,

then for every pair of numbers k and l of F,

kA' + lB' = P^{-1}(kA + lB)P,   A'B' = P^{-1}(AB)P.

Since every polynomial in A is built from A by the
operations of addition, multiplication and scalar multi-
plication, it follows that for every polynomial f(x) with
coefficients in F,

(30) P^{-1}f(A)P = f(A') = f(P^{-1}AP).

Theorem 59. Similar matrices have the same char-
acteristic and minimum polynomials, the same determi-
nant, norm and rank.

Let A' = P^{-1}AP. Then

P^{-1}(xI - A)P = xI - P^{-1}AP = xI - A'

so that

d(xI - A') = d(P^{-1})·d(xI - A)·d(P).

Since d(P^{-1}) = 1/d(P), the characteristic polynomials
are equal.

Let A and A' have the respective minimum poly-
nomials m(x) and m'(x). Since

m(A') = P^{-1}·m(A)·P,

and since m(A) = 0, m(A') is also 0. Then by Theorem
40, m'(x) divides m(x). Now A = PA'P^{-1} so that the
roles of A and A' are interchangeable. Hence m(x) like-
wise divides m'(x) and, as each has 1 as its leading co-
efficient, they are equal.

Since the determinant of A is equal to the constant
term of the characteristic function multiplied by (-1)^n
where n is the order of A, it is true that d(A) = d(A').
Similarly the norm of A is equal to the constant term of
the minimum function multiplied by (-1)^μ where μ is
the index of A, so that n(A) = n(A'). Since P and P^{-1}
are non-singular, the ranks of A and A' are equal by
Theorem 17.

42. The direct sum. Let A and B be two square ma-
trices, A of order r and B of order s. The matrix

    A ∔ B = [ A  0 ]
            [ 0  B ]

of order r+s is called their direct sum.

The following properties of the direct sum are ele-
mentary, and follow directly from the definition.

(a). (A ∔ B) ∔ C = A ∔ (B ∔ C).

(b). k(A ∔ B) = kA ∔ kB where k is in F.

(c). (A_1 + A_2) ∔ (B_1 + B_2) = (A_1 ∔ B_1) + (A_2 ∔ B_2).

(d). (A_1 ∔ B_1)(A_2 ∔ B_2) = A_1A_2 ∔ B_1B_2.

(e). (A ∔ B)^T = A^T ∔ B^T.

(f). (A ∔ B)^{-1} = A^{-1} ∔ B^{-1} if A and B are non-singular.

(g). d(A ∔ B) = d(A)·d(B).

(h). r(A ∔ B) = r(A) + r(B).

(i). A ∔ B ~ B ∔ A.

It is understood that A_1 and A_2 are of the same order,
and also that B_1 and B_2 are of the same order. It is not
necessary that A_1 and B_1 be of the same order.

We shall derive two further properties, namely 

(j). If A' ~ A and B' ~ B, then A' ∔ B' ~ A ∔ B.

(k). If f(x) is any polynomial,

f(A ∔ B) = f(A) ∔ f(B).

In order to prove (j), let

A' = P^{-1}AP,   B' = Q^{-1}BQ.

Then by (d) and (f),

A' ∔ B' = (P^{-1} ∔ Q^{-1})(A ∔ B)(P ∔ Q)
        = (P ∔ Q)^{-1}(A ∔ B)(P ∔ Q).

To prove (k), let

f(x) = Σ_{i=0} c_i x^i.

By (d),

(A ∔ B)^i = A^i ∔ B^i,

and by (b)

c_i(A ∔ B)^i = c_i A^i ∔ c_i B^i.

The result now follows from (c).
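
Property (k) lends itself to a direct numerical check. The following sketch is an added illustration; the function names direct_sum and poly_at, the sample matrices, and the sample polynomial are our own choices. It forms the block-diagonal matrix, evaluates a polynomial at it by Horner's rule, and compares with the direct sum of the separate evaluations.

```python
def matmul(X, Y):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)] for row in X]

def direct_sum(A, B):
    """Block-diagonal matrix with A in the upper left and B in the lower right."""
    r, s = len(A), len(B)
    return [row + [0] * s for row in A] + [[0] * r + row for row in B]

def poly_at(coeffs, M):
    """Evaluate c_0 I + c_1 M + c_2 M^2 + ... by Horner's rule."""
    n = len(M)
    result = [[0] * n for _ in range(n)]
    for c in reversed(coeffs):
        result = matmul(result, M)
        for i in range(n):
            result[i][i] += c
    return result

A = [[1, 2], [3, 4]]
B = [[5]]
f = [2, -3, 1]            # f(x) = 2 - 3x + x^2, an arbitrary illustrative choice

lhs = poly_at(f, direct_sum(A, B))
rhs = direct_sum(poly_at(f, A), poly_at(f, B))
assert lhs == rhs         # f(A ∔ B) = f(A) ∔ f(B)
```

The orders of A and B differ here, which (k) permits.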

More generally, let us suppose that the n×n matrix
A of rank r can be written

                                  [ A_1   0   ···   0  ]
    A = A_1 ∔ A_2 ∔ ··· ∔ A_k =   [  0   A_2  ···   0  ]
                                  [ ·················· ]
                                  [  0    0   ···  A_k ]

where A_i is of order n_i and rank r_i. Let the row vectors of
A span the space S, let the first n_1 row vectors of A span
S_1, the next n_2 row vectors of A span S_2, etc. Denote by

σ_{i1}, σ_{i2}, ···, σ_{i r_i}

a set of basic vectors of S_i. By (h) the rank of S is r_1 + r_2
+ ··· + r_k = r. From the form of the matrix A it is clear
that the vectors

σ_{11}, ···, σ_{1 r_1}, σ_{21}, ···, σ_{2 r_2}, ···, σ_{k1}, ···, σ_{k r_k}

are linearly independent and span S. Thus S is the sup-
plementary sum

S_1 + S_2 + ··· + S_k

of the subspaces S_i. (See §39.)

It is moreover evident that, if φ_i is any vector of S_i
and φ_j any vector of S_j, and if i ≠ j,

φ_i·φ_j = 0.

A supplementary sum of subspaces which has the addi-
tional property that every inner product of two vectors
of different subspaces is 0 is called the direct sum of
these subspaces, and we write

S = S_1 ∔ S_2 ∔ ··· ∔ S_k.

It is clear that if the matrix A is a direct sum of sub-
matrices, its row space S is a direct sum of corresponding
subspaces.

43. Invariant spaces. A subspace S_0 of S is said to be
an invariant space of the matrix A if, for every vector
φ of S_0, it is true that A·φ is in S_0. If φ is any vector of
an invariant space S_0 of A, then clearly f(A)·φ is in S_0
for every polynomial f(x).

In particular the null space of A is an invariant space
of A, but A has invariant spaces other than its null
space.

If S_1 and S_2 are invariant spaces of A, then clearly
S_1 ∪ S_2 and S_1 ∩ S_2 are also invariant spaces of A.

Theorem 60. If A and B are commutative matrices,
the null space of either is an invariant space of both.

Let φ be any vector of the null space of A. Then
A·φ = 0, and

AB·φ = BA·φ = 0

so that B·φ is also in the null space of A. Thus the null
space of A is an invariant space of B.

Corollary 60. If f(x) and g(x) are any two poly-
nomials, the null space of f(A) is an invariant space of
g(A).
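
Corollary 60 can be illustrated with the matrix A of the example in §39. In the sketch below, which is an addition for the reader and uses helper names of our own, f(x) = x^2 + x + 1 and g(x) = x. The vector φ = (3, 0, 1, 0), treated as a column, lies in the null space of f(A), and the check confirms that g(A)·φ = A·φ lies there as well.

```python
A = [[-1, 5, 5, 3],
     [5, 0, -17, -26],
     [-10, -2, 32, 51],
     [6, 2, -19, -31]]

def matvec(M, v):
    """Product of the matrix M with the column vector v."""
    return [sum(a * b for a, b in zip(row, v)) for row in M]

def f_of_A_times(v):
    """Apply f(A) = A^2 + A + I to the column vector v."""
    Av = matvec(A, v)
    AAv = matvec(A, Av)
    return [x + y + z for x, y, z in zip(AAv, Av, v)]

phi = [3, 0, 1, 0]                         # a vector of the null space of f(A)
assert f_of_A_times(phi) == [0, 0, 0, 0]

image = matvec(A, phi)                     # g(A) phi with g(x) = x
assert f_of_A_times(image) == [0, 0, 0, 0]   # still annihilated by f(A)
print(image)
```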

Theorem 61. If f(A)·φ and g(A)·φ are two vectors of
the same invariant space S_0 of A, and if d(x) is a greatest
common divisor of f(x) and g(x), then d(A)·φ is in S_0.

The greatest common divisor d(x) can be written in
the form

d(x) = p(x)·f(x) + q(x)·g(x).

Then

d(A)·φ = p(A)·f(A)·φ + q(A)·g(A)·φ.

Since f(A)·φ and g(A)·φ are in S_0, so are p(A)·f(A)·φ
and q(A)·g(A)·φ, and so is their sum.

Corollary 61a. If f(A)·φ and g(A)·φ are two vec-
tors of the same invariant space S_0 of A, and if there exists
no polynomial h(x) of degree less than the degree of f(x)
such that h(A)·φ is in S_0, then f(x) divides g(x).

For unless f(x) were a divisor of g(x), they would
have a greatest common divisor d(x) of degree less than
the degree of f(x), whence by the theorem d(A)·φ is in
S_0.

Corollary 61b. If f(x) is a polynomial of lowest de-
gree such that f(A)·φ is in an invariant space S_0 of A, then
f(x) divides the minimum function m(x) of A.

Take g(x) = m(x) in Corollary 61a.

Much of the importance of invariant spaces derives 
from the following result. 

Lemma 62. Let the total vector space S be the supple-
mentary sum of the subspaces

S_1, S_2, ···, S_k

where each S_i is of dimension r_i and has the basis σ_{i1},
···, σ_{i r_i}. Let P be the matrix whose column vectors
are

(31) σ_{11}, ···, σ_{1 r_1}, σ_{21}, ···, σ_{2 r_2}, ···, σ_{k1}, ···, σ_{k r_k}.

If each space S_i is an invariant space of the matrix A, then

P^{-1}AP = B_1 ∔ B_2 ∔ ··· ∔ B_k

where B_i is a matrix of order r_i.

Since the vectors (31) span S (see §39), P is non-
singular. Now form the product

AP = [β_{11}, ···, β_{1 r_1}, β_{21}, ···, β_{2 r_2}, ···, β_{k1}, ···, β_{k r_k}]
   = M,

the β's being the column vectors of M. Since each S_i is an
invariant space of A, the vectors

β_{i1}, β_{i2}, ···, β_{i r_i}

are linear combinations of the vectors

σ_{i1}, σ_{i2}, ···, σ_{i r_i}

only, and do not involve the other σ's. Thus if r_1 = 2,

β_{11} = b_{111}σ_{11} + b_{112}σ_{12},   β_{12} = b_{121}σ_{11} + b_{122}σ_{12}

so that

    [β_{11}, β_{12}] = [σ_{11}, σ_{12}] [ b_{111}  b_{121} ]
                                        [ b_{112}  b_{122} ].

In general M can be written

                                            [ B_1   0   ···   0  ]
    M = [σ_{11}, ···, σ_{1 r_1}, σ_{21},    [  0   B_2  ···   0  ]
         ···, σ_{k r_k}]                    [ ·················· ]
                                            [  0    0   ···  B_k ]

where each submatrix B_i has r_i rows and columns.
If we denote the last matrix by B, we have

AP = PB

or, since P is non-singular,

P^{-1}AP = B = B_1 ∔ B_2 ∔ ··· ∔ B_k.

We are now prepared to prove an important theorem 
in the theory of similarity. 

Theorem 62. Let A be any n×n matrix with ele-
ments in a field F, and let

m(x) = m_1(x)·m_2(x) ··· m_k(x)

be its minimum function expressed as a product of poly-
nomials which are relatively prime in pairs. Let the null
space of m_i(A) be of rank r_i. Then A is similar to a direct
sum

B_1 ∔ B_2 ∔ ··· ∔ B_k

where B_i is of order r_i, and the minimum function of B_i is
m_i(x).

If S_i' is the null space of m_i(A), we know from Theo-
rem 55 that the total space S is the supplementary sum
of the subspaces

S_1', S_2', ···, S_k'.

By Corollary 60 each S_i' is an invariant subspace of A.
Then by Lemma 62 there exists a non-singular matrix P
such that

P^{-1}AP = B_1 ∔ B_2 ∔ ··· ∔ B_k

where B_i is of order r_i.

Furthermore, by Corollary 60 each space S_j' is an
invariant subspace of the matrix m_i(A) for every i and j.
Then by Lemma 62

P^{-1}·m_i(A)·P = C_1 ∔ C_2 ∔ ··· ∔ C_k

where C_j is of order r_j, and the matrix P is the same as
above. Since S_i' is the null space of m_i(A), m_i(A)·σ_{ij} = 0
so that C_i is a zero matrix of r_i rows and columns. But
by (30) of §41,

P^{-1}·m_i(A)·P = m_i(P^{-1}AP)
               = m_i(B_1 ∔ B_2 ∔ ··· ∔ B_k).

By (k) of §42 this is equal to

m_i(B_1) ∔ m_i(B_2) ∔ ··· ∔ m_i(B_k).

Consequently C_j = m_i(B_j), and in particular m_i(B_i) = 0.
It then follows from Theorem 40 that m_i(x) is divisible
by the minimum function m_i'(x) of B_i.

Since A and B are similar, they have the same mini-
mum function m(x) by Theorem 59. Define m'(x) as the
product

m'(x) = m_1'(x)·m_2'(x) ··· m_k'(x).

Then m(x) is divisible by m'(x). By Property (k) of §42,

m'(B) = m'(B_1) ∔ m'(B_2) ∔ ··· ∔ m'(B_k)
      = m_1'(B_1)·m_2'(B_1) ··· m_k'(B_1)
        ∔ m_1'(B_2)·m_2'(B_2) ··· m_k'(B_2)
        ∔ ··· ∔ m_1'(B_k)·m_2'(B_k) ··· m_k'(B_k).

Since m_i'(B_i) = 0, it is true that m'(B) = 0 so that m'(x)
is divisible by the minimum function m(x) of B. Since
m(x) and m'(x) each have leading coefficient 1, they are
equal. Now if it were true that m_i'(x) were a proper
divisor of m_i(x) for some value of i, it would follow that
m'(x) would be a proper divisor of m(x), which is not
true. Consequently for every value of i, m_i(x) is the
minimum function of B_i.

We can illustrate this theorem by continuing with
the example of §39. We have

        [  -1    5    5    3 ]        [ 3   0   2   0 ]
    A = [   5    0  -17  -26 ],   P = [ 0   6   0   2 ]
        [ -10   -2   32   51 ]        [ 1  -4   1  -3 ]
        [   6    2  -19  -31 ]        [ 0   3   0   2 ]

    AP = A[σ_{11}, σ_{12}, σ_{21}, σ_{22}]

       = [(2/3)σ_{11} - (1/3)σ_{12}, (19/3)σ_{11} - (5/3)σ_{12},
          (3/2)σ_{21} - (7/2)σ_{22}, (1/2)σ_{21} - (1/2)σ_{22}]

                                          [  2/3  19/3    0     0  ]
       = [σ_{11}, σ_{12}, σ_{21}, σ_{22}] [ -1/3  -5/3    0     0  ] = PB.
                                          [   0     0    3/2   1/2 ]
                                          [   0     0   -7/2  -1/2 ]

The minimum functions of the matrices

    B_1 = [  2/3  19/3 ],   B_2 = [  3/2   1/2 ]
          [ -1/3  -5/3 ]          [ -7/2  -1/2 ]

are, respectively,

    m_1(x) = x^2 + x + 1,   m_2(x) = x^2 - x + 1.
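
The relation AP = PB of this example can be confirmed in exact rational arithmetic. The sketch below is an added illustration with helper names of our own (matmul, direct_sum, quad); it takes B_1 = [2/3 19/3; -1/3 -5/3] and B_2 = [3/2 1/2; -7/2 -1/2], checks AP = PB without inverting P, and verifies that each block satisfies its quadratic.

```python
from fractions import Fraction as Fr

A = [[-1, 5, 5, 3],
     [5, 0, -17, -26],
     [-10, -2, 32, 51],
     [6, 2, -19, -31]]
P = [[3, 0, 2, 0],            # columns are sigma_11, sigma_12, sigma_21, sigma_22
     [0, 6, 0, 2],
     [1, -4, 1, -3],
     [0, 3, 0, 2]]
B1 = [[Fr(2, 3), Fr(19, 3)], [Fr(-1, 3), Fr(-5, 3)]]
B2 = [[Fr(3, 2), Fr(1, 2)], [Fr(-7, 2), Fr(-1, 2)]]

def matmul(X, Y):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)] for row in X]

def direct_sum(X, Y):
    """Block-diagonal matrix with X in the upper left and Y in the lower right."""
    r, s = len(X), len(Y)
    return [row + [0] * s for row in X] + [[0] * r + row for row in Y]

def quad(M, b):
    """Return M^2 + b*M + I for a 2x2 matrix M."""
    M2 = matmul(M, M)
    return [[M2[i][j] + b * M[i][j] + int(i == j) for j in range(2)] for i in range(2)]

B = direct_sum(B1, B2)
assert matmul(A, P) == matmul(P, B)          # AP = PB
assert quad(B1, 1) == [[0, 0], [0, 0]]       # B_1^2 + B_1 + I = 0
assert quad(B2, -1) == [[0, 0], [0, 0]]      # B_2^2 - B_2 + I = 0
```

Since neither block is a scalar matrix, no polynomial of degree 1 annihilates it, so the quadratics above are indeed the minimum functions.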
44. The non-derogatory case. Let the matrix A be
non-derogatory of order and index n, and let its mini-
mum function be

m(x) = m_0 + m_1 x + ··· + m_{n-1} x^{n-1} + x^n.

Let φ be a vector which is not annihilated by any matrix
m_0(A) where m_0(x) is a proper divisor of m(x). By Theo-
rem 58 such a vector exists. Consider the n vectors

(32) φ, Aφ, A^2 φ, ···, A^{n-1} φ.

If these vectors were linearly dependent, there would
exist a relation

c_0 φ + c_1 Aφ + c_2 A^2 φ + ··· + c_{n-1} A^{n-1} φ = 0

which could be written

f(A)·φ = 0,   f(x) = c_0 + c_1 x + ··· + c_{n-1} x^{n-1} ≠ 0.

Let d(x) be the greatest common divisor of f(x) and
m(x). Since f(x) is of degree < n, d(x) ≠ m(x). Then
there exist polynomials p(x) and q(x) such that

p(x)·f(x) + q(x)·m(x) = d(x)

and consequently

p(A)·f(A)·φ + q(A)·m(A)·φ = d(A)·φ

so that d(A)·φ = 0. But d(x) is a proper divisor of m(x)
so that φ cannot be annihilated by d(A). Hence the
vectors (32) are linearly independent.

Denote by P the matrix whose column vectors are
given by (32). Since these vectors are linearly inde-
pendent, P is non-singular. Now form the product

    AP = A[φ, Aφ, A^2 φ, ···, A^{n-1} φ]

       = [Aφ, A^2 φ, ···, -m_0 φ - m_1 Aφ - ··· - m_{n-1} A^{n-1} φ]

                                        [ 0  0  ···  0  -m_0     ]
       = [φ, Aφ, A^2 φ, ···, A^{n-1} φ] [ 1  0  ···  0  -m_1     ]
                                        [ 0  1  ···  0  -m_2     ]
                                        [ ······················ ]
                                        [ 0  0  ···  1  -m_{n-1} ]

       = PC.

Thus P^{-1}AP = C where C is the companion matrix
(§31) of the minimum equation of A, and we have
proved

Theorem 63. Every non-derogatory matrix is similar
to the companion matrix of its minimum equation.

Now suppose that we write

m(x) = m_1(x)·m_2(x) ··· m_k(x)

where the m_i(x) are relatively prime in pairs, and the
sum of their degrees is n. By Theorem 62 we know that
there exists a non-singular matrix P such that

P^{-1}AP = B_1 ∔ B_2 ∔ ··· ∔ B_k = B

where m_i(x) is the minimum function of B_i. We know
that the degree of the minimum function is less than or
equal to the order of the matrix, and that the sum of the
orders of the B_i is n. If for some i the degree of m_i(x)
were less than the order of B_i, the degree of m(x) would
be less than n, which is not true. Hence every B_i is non-
derogatory.

By Theorem 63 there exists a non-singular matrix Q_i
such that

Q_i^{-1}B_iQ_i = C_i

where C_i is the companion matrix of the equation
m_i(x) = 0. Then as in the proof of Property (j) of §42,

Q^{-1}BQ = C_1 ∔ C_2 ∔ ··· ∔ C_k,

where

Q = Q_1 ∔ Q_2 ∔ ··· ∔ Q_k.

Consequently

Q^{-1}P^{-1}APQ = C_1 ∔ C_2 ∔ ··· ∔ C_k

and we have

Theorem 64. If the matrix A is non-derogatory, and
if its minimum function is

m(x) = m_1(x)·m_2(x) ··· m_k(x)

where the factors are relatively prime in pairs, then A is
similar to the direct sum of the companion matrices of the
factors m_i(x).

Let us return to the example used in §43. Let us
choose φ = (1, 0, 0, 0). Then

    P = [φ, Aφ, A^2 φ, A^3 φ].

We find that

               [ 0  0  0  -1 ]
    P^{-1}AP = [ 1  0  0   0 ]
               [ 0  1  0  -1 ]
               [ 0  0  1   0 ]

which is the companion matrix of the minimum equa-
tion

    m(x) = x^4 + x^2 + 1 = 0.
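
The construction of Theorem 63 can be carried out numerically for this example. The following sketch is an added illustration with helper names of our own; it builds P column by column from φ = (1, 0, 0, 0), writes down the companion matrix C of x^4 + x^2 + 1, and checks AP = PC together with the non-singularity of P, which avoids computing P^{-1}.

```python
A = [[-1, 5, 5, 3],
     [5, 0, -17, -26],
     [-10, -2, 32, 51],
     [6, 2, -19, -31]]

def matvec(M, v):
    return [sum(a * b for a, b in zip(row, v)) for row in M]

def matmul(X, Y):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)] for row in X]

def det(M):
    """Determinant by cofactor expansion along the first row (adequate for 4x4)."""
    if len(M) == 1:
        return M[0][0]
    return sum((-1) ** j * M[0][j] * det([row[:j] + row[j + 1:] for row in M[1:]])
               for j in range(len(M)))

phi = [1, 0, 0, 0]
cols = [phi]
for _ in range(3):
    cols.append(matvec(A, cols[-1]))      # phi, A phi, A^2 phi, A^3 phi
P = [list(row) for row in zip(*cols)]     # these vectors are the columns of P

# Companion matrix of m(x) = x^4 + x^2 + 1: 1's below the diagonal,
# last column -m_0, -m_1, -m_2, -m_3 = -1, 0, -1, 0.
C = [[0, 0, 0, -1],
     [1, 0, 0, 0],
     [0, 1, 0, -1],
     [0, 0, 1, 0]]

assert det(P) != 0                        # P is non-singular
assert matmul(A, P) == matmul(P, C)       # AP = PC, hence P^{-1}AP = C
print(P)
```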

The second canonical form can be obtained from the
matrices B_1 and B_2 of §43. Let

    Q_1 = [ 1   2/3 ],   Q_2 = [ 1   3/2 ]
          [ 0  -1/3 ]          [ 0  -7/2 ]

Then

    Q_1^{-1}B_1Q_1 = [ 0  -1 ],   Q_2^{-1}B_2Q_2 = [ 0  -1 ]
                     [ 1  -1 ]                     [ 1   1 ]

are the companion matrices of the equations obtained
from the respective factors x^2 + x + 1 and x^2 - x + 1 of the
minimum function of A. Then A is similar to the matrix

    [ 0  -1  0   0 ]
    [ 1  -1  0   0 ]
    [ 0   0  0  -1 ]
    [ 0   0  1   1 ]

45. A canonical form. As we saw in Theorem 62,
every matrix with elements in F is similar to a direct
sum of matrices whose minimum functions are powers of
irreducible polynomials. We shall restrict attention,
then, to an n×n matrix A of index μ whose minimum
function is

m(x) = [l(x)]^h,   μ = jh,

l(x) = l_0 + l_1 x + l_2 x^2 + ··· + l_{j-1} x^{j-1} + x^j.

Let φ be a vector which is in the null space of
[l(A)]^h and not in the null space of [l(A)]^{h-1}. Such a
vector exists by Theorem 56. Denote [l(A)]^i·φ by φ_i,
and form the set of vectors

(33) φ, Aφ, A^2 φ, ···, A^{j-1} φ, φ_1, Aφ_1, A^2 φ_1, ···, A^{j-1} φ_1,
     ···, φ_{h-1}, Aφ_{h-1}, A^2 φ_{h-1}, ···, A^{j-1} φ_{h-1}.

These vectors are jh = μ in number, and we shall prove
that they are linearly independent. If there were a linear
relation among these vectors, it would take the form

f(A)·φ = 0,   f(x) = c_0 + c_1 x + c_2 x^2 + ··· + c_{μ-1} x^{μ-1}.

But by Corollary 56 this is impossible.

It may be well to note at this point that the vectors

(34) φ, Aφ, A^2 φ, ···, A^{μ-1} φ

are also linearly independent, and span the same space
as is spanned by the vectors (33).

Let us now consider the non-derogatory case in
which μ = n. The vectors (33) are now n in number and
linearly independent so that they span S. Let P be the
matrix whose column vectors are the vectors (33). Form
the product AP.

To relieve the notation, let us assume

h = 3,   j = 3,   l(x) = l_0 + l_1 x + l_2 x^2 + x^3.

Then

A^3 φ = l(A)·φ - l_0 φ - l_1 Aφ - l_2 A^2 φ,
A^3 φ_1 = l(A)·φ_1 - l_0 φ_1 - l_1 Aφ_1 - l_2 A^2 φ_1,
A^3 φ_2 = -l_0 φ_2 - l_1 Aφ_2 - l_2 A^2 φ_2,

AP = [Aφ, A^2 φ, -l_0 φ - l_1 Aφ - l_2 A^2 φ + φ_1,
      Aφ_1, A^2 φ_1, -l_0 φ_1 - l_1 Aφ_1 - l_2 A^2 φ_1 + φ_2,
      Aφ_2, A^2 φ_2, -l_0 φ_2 - l_1 Aφ_2 - l_2 A^2 φ_2]

   = [φ, Aφ, A^2 φ, φ_1, Aφ_1, A^2 φ_1, φ_2, Aφ_2, A^2 φ_2]·B = PB

where

               [ 0  0  -l_0  0  0    0   0  0    0  ]
               [ 1  0  -l_1  0  0    0   0  0    0  ]
               [ 0  1  -l_2  0  0    0   0  0    0  ]
               [ 0  0    1   0  0  -l_0  0  0    0  ]
    (35)   B = [ 0  0    0   1  0  -l_1  0  0    0  ]
               [ 0  0    0   0  1  -l_2  0  0    0  ]
               [ 0  0    0   0  0    1   0  0  -l_0 ]
               [ 0  0    0   0  0    0   1  0  -l_1 ]
               [ 0  0    0   0  0    0   0  1  -l_2 ]

It is clear that

                  |  x   0     l_0    | 3
    d(xI - B) =   | -1   x     l_1    |
                  |  0  -1   x + l_2  |

As we saw in §31, the displayed determinant is the char-
acteristic determinant of the companion matrix of
l(x) = 0, so that

d(xI - B) = [l(x)]^3.

Now xI - B has an (n - 1)-rowed minor determinant in
the lower left-hand corner which is equal to ±1, so that
by Theorem 41 the minimum function of B is [l(x)]^3.

In general the matrix of type (35) whose minimum
function is [l(x)]^h will have along the principal diagonal
h blocks, each of which is the companion matrix of
l(x) = 0. The diagonal just below the principal diagonal
consists entirely of 1's. We have

Theorem 65. Every non-derogatory matrix whose
minimum function is [l(x)]^h where l(x) is irreducible of
degree j is similar to a matrix of the type (35) with h blocks
along the diagonal.

The case where F is the complex field, or in fact any
algebraically closed field, deserves special mention. In
this case we can write

m(x) = (x - x_1)^{h_1}(x - x_2)^{h_2} ··· (x - x_k)^{h_k}

so that A is similar to a direct sum of matrices of the
form

    [ x_i   0    0   ···   0    0  ]
    [  1   x_i   0   ···   0    0  ]
    [  0    1   x_i  ···   0    0  ]
    [ ···························· ]
    [  0    0    0   ···   1   x_i ]

of h_i rows and columns. This is the familiar Jordan nor-
mal form of a matrix with complex elements.
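
A single block of this form is easily examined. The sketch below is an added illustration; the function name jordan_block and the sample values h = 3, x = 2 are our own. It builds a block with 1's just below the diagonal, as in the display above, and checks that (J - xI)^2 is not zero while (J - xI)^3 is, so that the minimum function of the block is (t - x)^3.

```python
def matmul(X, Y):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)] for row in X]

def jordan_block(x, h):
    """h-by-h block with x on the diagonal and 1's just below it."""
    return [[x if i == j else int(i == j + 1) for j in range(h)] for i in range(h)]

def is_zero(M):
    return all(v == 0 for row in M for v in row)

h, x = 3, 2                               # illustrative size and root
J = jordan_block(x, h)
N = [[J[i][j] - x * int(i == j) for j in range(h)] for i in range(h)]   # J - xI

N2 = matmul(N, N)
N3 = matmul(N2, N)
assert not is_zero(N2) and is_zero(N3)    # (J - xI)^3 = 0 but (J - xI)^2 is not
print(J)
```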

46. The derogatory case. If A has the minimum
function

m(x) = [l(x)]^h,   jh = μ < n,

the vectors (33) will span an invariant subspace S_1 of S,
but will not span S. There will exist, then, vectors ψ' which
are linearly independent of the vectors (33). For each
such vector ψ'

[l(A)]^h·ψ' = 0

which is surely in S_1, while

[l(A)]^0·ψ' = ψ'

is not in S_1. For each such vector ψ', therefore, there
exists an integer k, 0 < k ≤ h, such that [l(A)]^k·ψ' is in
S_1 while [l(A)]^{k-1}·ψ' is not in S_1.

Of all such vectors ψ' choose one whose correspond-
ing integer k is maximal, and call it ψ''. There will exist
a relation

[l(A)]^k·ψ'' = g(A)·φ,   1 ≤ k ≤ h.

By the division algorithm we can determine polynomials
q(x) and r(x), the latter either zero or of degree less than
jk, such that

g(x) = [l(x)]^k·q(x) + r(x).

Then

[l(A)]^k·ψ'' = [l(A)]^k·q(A)·φ + r(A)·φ.

Multiplying on the left by [l(A)]^{h-k}, we have

[l(A)]^h·ψ'' = m(A)·q(A)·φ + [l(A)]^{h-k}·r(A)·φ,

so that

[l(A)]^{h-k}·r(A)·φ = 0.

The polynomial

[l(x)]^{h-k}·r(x)

is 0 or of degree less than j(h - k) + jk = μ, and the latter
is impossible by Corollary 56. Hence the polynomial is iden-
tically zero, and r(x) = 0. Then there exists a relation of
the form

[l(A)]^k·ψ'' = [l(A)]^k·q(A)·φ,   1 ≤ k ≤ h.

We shall now define a new vector

\[ \psi = \psi'' - q(A)\phi. \]

It is clear that

\[ [l(A)]^k \psi = 0, \]

which is in $\Sigma_1$. Consider the equation

\[ [l(A)]^{k-1} \psi = [l(A)]^{k-1} \psi'' - [l(A)]^{k-1} q(A)\phi. \]

Since $\Sigma_1$ is an invariant subspace, the last term is in $\Sigma_1$.
If the left member were also in $\Sigma_1$, then

\[ [l(A)]^{k-1} \psi'' \]

would be in $\Sigma_1$, contrary to the definition of $k$. Thus $\psi$ is
a vector whose corresponding integer $k$ is maximal, and
by Corollary 61b $[l(x)]^k$ is a polynomial $p(x)$ of smallest
degree such that $p(A)\psi$ is in $\Sigma_1$.

We shall now prove that the vectors

\[ \begin{aligned}
&\psi, A\psi, A^2\psi, \cdots, A^{j-1}\psi, \quad \psi_1, A\psi_1, A^2\psi_1, \cdots, A^{j-1}\psi_1, \\
&\cdots, \quad \psi_{k-1}, A\psi_{k-1}, A^2\psi_{k-1}, \cdots, A^{j-1}\psi_{k-1},
\end{aligned} \qquad (37) \]

where $\psi_i = [l(A)]^i \psi$, are linearly independent and form
with the vectors (33) a linearly independent set, and
span an invariant subspace $\Sigma_2$ of S.

If the vectors (37) were linearly dependent among
themselves, or if they were linearly dependent upon the
vectors (33), there would exist a relation

\[ f(A)\psi = g(A)\phi \]

with $f(x)$ of degree $< jk$ and $g(x)$ either 0 or of degree
$< jh$. By Corollary 61a, $f(x)$ is divisible by $[l(x)]^k$, which
is of degree $jk$. Hence $f(x) = 0$ and no linear relation
among the vectors (33) and (37) can exist.

Since $[l(A)]^k \psi = 0$, the product by A of each vector
of (37) is equal to a linear combination of these vectors.
Thus the space $\Sigma_2$ spanned by the vectors (37) is invariant
under A and is of rank (or dimension) $jk$.

47. Continuation of the derogatory case. We shall 
carry the process one step further. This third step is 
characteristic of the general case. 

Either $\Sigma_1 \cup \Sigma_2 = S$, or else there exist vectors $\chi'$ which
are not in $\Sigma_1 \cup \Sigma_2$. For each such vector $\chi'$ there exists
an integer $e$ such that $[l(A)]^e \chi'$ is in the union, while
$[l(A)]^{e-1} \chi'$ is not in the union. Let $\chi''$ be one such vector
whose corresponding integer $e$ is maximal. Since in
the previous step $\psi$ was chosen with a maximal $k$, it will
be true that $0 < e \le k \le h$.

There will exist a relation

\[ [l(A)]^e \chi'' = g_1(A)\phi + g_2(A)\psi. \]

We can write

\[ g_1(x) = [l(x)]^e q_1(x) + r_1(x), \]
\[ g_2(x) = [l(x)]^e q_2(x) + r_2(x) \]

where each of $r_1(x)$ and $r_2(x)$ is 0 or of degree $< je$. Then

\[ [l(A)]^k \chi'' = [l(A)]^{k-e} [g_1(A)\phi + g_2(A)\psi], \]

\[ [l(A)]^k \chi'' - [l(A)]^{k-e} g_1(A)\phi - [l(A)]^k q_2(A)\psi = [l(A)]^{k-e} r_2(A)\psi. \]

By the definition of $k$, and the fact that $\phi$ is in $\Sigma_1$, the
entire left member of the above equation is in $\Sigma_1$, while
the right member is in $\Sigma_2$. But $\Sigma_1 \cap \Sigma_2 = 0$ so that each
member is 0. Then by Corollary 61a, $[l(x)]^{k-e} r_2(x)$ is
divisible by $[l(x)]^k$, whence $r_2(x) = 0$.
We now have

\[ \begin{aligned}
[l(A)]^h \chi'' &= [l(A)]^{h-e} [g_1(A)\phi + g_2(A)\psi] \\
&= m(A) q_1(A)\phi + [l(A)]^{h-e} r_1(A)\phi + m(A) q_2(A)\psi.
\end{aligned} \]

That is, $[l(A)]^{h-e} r_1(A)\phi = 0$. Hence $[l(x)]^{h-e} r_1(x)$ is
divisible by $[l(x)]^h$, and consequently $r_1(x) = 0$.

We define $\chi$ by the equation

\[ \chi = \chi'' - q_1(A)\phi - q_2(A)\psi. \]

Then clearly

\[ [l(A)]^e \chi = 0. \]

If $[l(A)]^{e-1} \chi$ were in $\Sigma_1 \cup \Sigma_2$, we should have

\[ [l(A)]^{e-1} \chi'' = [l(A)]^{e-1} q_1(A)\phi + [l(A)]^{e-1} q_2(A)\psi + [l(A)]^{e-1} \chi \]

which would be in $\Sigma_1 \cup \Sigma_2$, contrary to the definition of
$\chi''$. Thus $\chi$ is a vector whose corresponding integer $e$ is
maximal, and such that

\[ [l(A)]^e \chi = 0, \qquad [l(A)]^{e-1} \chi \text{ not in } \Sigma_1 \cup \Sigma_2. \]

We now consider the subspace $\Sigma_3$ spanned by the
vectors

\[ \begin{aligned}
&\chi, A\chi, A^2\chi, \cdots, A^{j-1}\chi, \quad \chi_1, A\chi_1, A^2\chi_1, \cdots, A^{j-1}\chi_1, \\
&\cdots, \quad \chi_{e-1}, A\chi_{e-1}, A^2\chi_{e-1}, \cdots, A^{j-1}\chi_{e-1},
\end{aligned} \qquad (38) \]

where $\chi_i = [l(A)]^i \chi$. As before we prove that these
vectors are linearly independent among themselves.

Suppose that there were a linear relation among the
vectors (33), (37) and (38). It would take the form

\[ f(A)\chi = g_1(A)\phi + g_2(A)\psi \]

where $f(x)$ is 0 or of degree $< je$, $g_1(x)$ is 0 or of degree
$< jh$ and $g_2(x)$ is 0 or of degree $< jk$. If $f(x) \ne 0$, we may
write

\[ f(x) = [l(x)]^i f_1(x), \qquad 0 \le i < e, \]

where $f_1(x)$ is relatively prime to $l(x)$ and hence to $m(x)$,
so that $f_1(A)$ has the inverse $f_1^{-1}(A)$. Then

\[ [l(A)]^i \chi = f_1^{-1}(A) g_1(A)\phi + f_1^{-1}(A) g_2(A)\psi \]

where $i < e$. But for $i < e$ the vector $[l(A)]^i \chi$ cannot be
in the union $\Sigma_1 \cup \Sigma_2$. Hence the vectors (33), (37) and
(38) are linearly independent.

Since $[l(A)]^e \chi = 0$, the space $\Sigma_3$ is invariant under
A.

As long as there remains a vector which is not in the
union of the invariant spaces

\[ \Sigma_1, \Sigma_2, \Sigma_3, \cdots \]

we can find another invariant space whose basic vectors
are of the same form as (33), and are linearly independent
of the basic vectors of the union of the spaces previously
obtained. In a finite number of steps we obtain
the total space S expressed as a supplementary sum of
such subspaces,

\[ S = \Sigma_1 + \Sigma_2 + \cdots + \Sigma_t. \]

We have proved

Theorem 66. If the matrix A has the minimum function
$[l(x)]^{h_1}$, where $l(x)$ is irreducible, the total vector space
S can be written as a sum of subspaces

\[ S = \Sigma_1 + \Sigma_2 + \cdots + \Sigma_t \]

each of which is an invariant space of A, and each of which
has a basis of the form (33) with $h \le h_1$.

48. The rational canonical form. If $S = \Sigma_1 + \Sigma_2 + \cdots + \Sigma_t$
is any decomposition of S into subspaces each of
which is an invariant space of A, and if

\[ \sigma_{i1}, \sigma_{i2}, \cdots, \sigma_{i r_i} \]

is any basis for $\Sigma_i$, we may form the matrix P whose
column-vectors are

\[ \sigma_{11}, \cdots, \sigma_{1 r_1}, \sigma_{21}, \cdots, \sigma_{2 r_2}, \cdots, \sigma_{t1}, \cdots, \sigma_{t r_t}. \]

By Lemma 62

\[ P^{-1}AP = B_1 \dotplus B_2 \dotplus \cdots \dotplus B_t \]

where $B_i$ is of order $r_i$. Then by (k) of §42

\[ m(A) = m(P^{-1}AP) = m(B_1) \dotplus m(B_2) \dotplus \cdots \dotplus m(B_t) \]

so that

\[ m(B_i) = 0, \]

and the minimum function of $B_i$ is a divisor of the minimum
function $[l(x)]^h$ of A. But it may not be true that
each $B_i$ is non-derogatory.

If, however, we assume the particular decomposition
of S described in Theorem 66, and if we take (33) as the
basis for $\Sigma_1$, (37) as the basis for $\Sigma_2$, (38) as the basis for
$\Sigma_3$, etc., we can show that in the resulting expression

\[ P^{-1}AP = B_1 \dotplus B_2 \dotplus \cdots \dotplus B_t \]

every matrix $B_i$ is non-derogatory and is of the type
(35).

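To make the notation less complex, let us assume
that

\[ n = 9, \qquad m(x) = [l(x)]^2, \qquad l(x) = l_0 + l_1 x + l_2 x^2 + x^3, \]
\[ P = [\phi, A\phi, A^2\phi, \phi_1, A\phi_1, A^2\phi_1, \psi, A\psi, A^2\psi] \]

where $\phi$ is a vector in the null space of $[l(A)]^2$ which is
not in the null space of $l(A)$, where $\phi_1 = l(A)\phi$, and
where $\psi$ is a vector determined as in §46. Then (see §45)

\[ \begin{aligned}
AP = [&A\phi, A^2\phi, -l_0\phi - l_1 A\phi - l_2 A^2\phi + \phi_1, \\
&A\phi_1, A^2\phi_1, -l_0\phi_1 - l_1 A\phi_1 - l_2 A^2\phi_1, \\
&A\psi, A^2\psi, -l_0\psi - l_1 A\psi - l_2 A^2\psi] \\
= [&\phi, A\phi, A^2\phi, \phi_1, A\phi_1, A^2\phi_1, \psi, A\psi, A^2\psi]\, B = PB
\end{aligned} \]

where

\[ B = \begin{bmatrix}
0 & 0 & -l_0 & 0 & 0 & 0 & 0 & 0 & 0 \\
1 & 0 & -l_1 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 1 & -l_2 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 1 & 0 & 0 & -l_0 & 0 & 0 & 0 \\
0 & 0 & 0 & 1 & 0 & -l_1 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 1 & -l_2 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & -l_0 \\
0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & -l_1 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & -l_2
\end{bmatrix}. \qquad (39) \]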


It should be noted that this matrix differs from (35) 
only in having a 0 instead of a 1 in the 7th row and 6th 
column. That is, (39) is a direct sum of a matrix of type 
(35) of order 6 and another of order 3. 

In general, we shall have

\[ P^{-1}AP = B_1 \dotplus B_2 \dotplus \cdots \dotplus B_t \]

where each matrix is of type (35), and hence is non-derogatory
(see §45). The minimum function of $B_1$ is
$[l(x)]^h$ (see §46), the minimum function of $B_2$ is $[l(x)]^k$
(see §46), the minimum function of $B_3$ is $[l(x)]^e$ (see
§47), etc.
Theorem 67. If the matrix A has the minimum function
$[l(x)]^h$ where $l(x)$ is irreducible, A is similar to a direct
sum of non-derogatory matrices each of which has as its
minimum function a divisor of $[l(x)]^h$. Each of these
non-derogatory matrices is of type (35).

By combining this result with Theorem 62, we have
the general structure theorem:

Theorem 68. Let A be a matrix whose minimum function
is

\[ m(x) = [l_1(x)]^{h_1} [l_2(x)]^{h_2} \cdots [l_k(x)]^{h_k}. \]

Then A is similar to a direct sum of non-derogatory matrices
each of which has as its minimum function a divisor
of one of the polynomials $[l_i(x)]^{h_i}$. Each matrix in this
direct sum is of the type (35).

A matrix of the form described in the above theorem
will be said to be in the rational canonical form for a
matrix under similarity transformations.

ELEMENTARY DIVISORS

49. Equivalence of matrices. Consider the set of all 
polynomials in x with coefficients in a field F. This set 
is called the polynomial domain of F, and is denoted by 
F[x]. 

The domain F[x] has many of the properties of the
set of rational integers. It is closed under addition,
subtraction and multiplication, but not under division.
Those polynomials which are divisors of 1 are called
units, and it is clear that the units of F[x] are the
polynomials of degree zero, that is, the numbers $\ne 0$ of F.

Those polynomials which are neither 0 nor units fall 
into two classes, prime or irreducible polynomials, and 
composite or reducible polynomials. Every reducible 
polynomial can be represented as the product of a finite 
number of irreducible polynomials and, except for unit 
factors and the order of the factors, this representation 
is unique. Two polynomials whose quotient is a unit are 
said to be associated . 

A matrix P with elements in F[x] is called unimodular
if it has an inverse whose elements are likewise in
F[x]. Clearly $PP^{-1} = I$ implies

\[ d(P)\, d(P^{-1}) = 1. \]

If P and $P^{-1}$ have elements in F[x], both $d(P)$ and
$d(P^{-1})$ are in F[x] and are units. If, conversely, $d(P)$ is
a unit, the elements of $P^{-1}$ are in F[x]. Thus P is
unimodular if and only if its determinant is a unit of F[x].

Two matrices A and B with elements in F[x] are said
to be equivalent in F[x] if unimodular matrices P and Q
exist such that

\[ B = PAQ. \]

Clearly the above equation implies

\[ A = P^{-1} B Q^{-1} \]

so that equivalence is a symmetric relation. If also
$C = RBS$ where R and S are unimodular, then

\[ C = RPAQS. \]

Since RP and QS are unimodular, A and C are equivalent.
Thus equivalence is transitive.

The theory of elementary transformations applies 
mutatis mutandis to matrices with polynomial elements. 
The three types are:

Type I: The interchange of two rows or of two col- 
umns. 

Type II: The multiplication of the elements of any 
row, or column, by the same unit. 

Type III: The addition to any row, or column, of a 
multiple by a polynomial of F[x] of another row, or col- 
umn. 

The inverse of each elementary transformation is an 
elementary transformation of the same type. A trans- 
formation of Type I merely changes the sign of the de- 
terminant of the matrix; one of Type II multiplies the 
determinant by the unit; while a transformation of 
Type III leaves the determinant unchanged. Thus a uni- 
modular matrix is transformed into a unimodular matrix 
by every elementary transformation. 

Every elementary transformation on the rows (col- 
umns) of a matrix A can be accomplished by multiply- 
ing A on the left (right) by a unimodular matrix. The 
proof is similar to the proof of Theorem 13. 
The theory connected with the Hermite canonical
form may be carried over with only slight modifications.
If A is a square matrix, there exists a non-singular
matrix P which is a product of elementary matrices such
that PA is triangular with every diagonal element
either 0 or a polynomial with leading coefficient 1; if the
diagonal element in any row is 0, the entire row is 0; if
the diagonal element in any column is of degree $d$, every
other element of the column is 0 or of degree $< d$ with
leading coefficient 1. This form is unique.

Since the Hermite form of a unimodular matrix is $I$,
it follows as in Theorem 20 that every unimodular
matrix is equal to a product of elementary matrices.


50. Invariant factors. Let A be a matrix whose elements
are polynomials of F[x], and let

\[ A_1^{(k)}, A_2^{(k)}, \cdots \]

be all of the $k$-rowed minor determinants which can be
formed from A. If A is of rank $r$, and if $k \le r$, there will
exist at least one $k$-rowed minor determinant of A which
is not equal to zero, so that there will exist a (non-zero)
greatest common divisor of all the $k$-rowed minor
determinants of A. This polynomial, chosen so that the
coefficient of the leading term is 1, we shall call the $k$-th
determinantal divisor of A, and denote by $d_k$.

If $k$ exceeds the rank $r$ of A, there exists no
determinantal divisor $d_k$.

We know that, if A and B are equivalent, they have
the same rank $r$. We shall show that they have the same
determinantal divisors

\[ d_1, d_2, \cdots, d_r. \]

Let $A_i^{(k)}$ and $B_i^{(k)}$
be the $k$-rowed minor determinants of A and B, respectively,
and let $d_k$ be the g.c.d. of the $A_i^{(k)}$ and $d_k'$ be the
g.c.d. of the $B_i^{(k)}$.

Let B be obtained from A by an elementary transformation
of Type I. Then the $B_i^{(k)}$ are merely a permutation
of the $A_i^{(k)}$, apart from sign, so that $d_k' = d_k$.

Let B be obtained from A by an elementary transformation
of Type II. Then each determinant $B_i^{(k)}$ is
either equal to the corresponding $A_i^{(k)}$ or differs from it
by a unit factor. Thus $d_k' = d_k$.

Let B be obtained from A by an elementary transformation
of Type III. Then each $B_i^{(k)}$ is either equal to
the corresponding $A_i^{(k)}$ or is of the form

\[ B_i^{(k)} = A_i^{(k)} + m A_j^{(k)} \]

where $m$ is in F[x]. Clearly $d_k$ is a divisor of all the $B_i^{(k)}$
and hence is a divisor of $d_k'$. Since the inverse of an
elementary transformation of Type III is an elementary
transformation of Type III, $d_k'$ is a divisor of $d_k$ and,
since each has 1 for its leading coefficient, $d_k = d_k'$.

Since every unimodular matrix can be written as a
product of elementary matrices, it follows that, if
$B = PAQ$ where P and Q are unimodular, A and B have
the same determinantal divisors $d_1, d_2, \cdots, d_r$.

If $A_i^{(k+1)}$ is a $(k+1)$-rowed minor determinant of A,
we can by the Laplace expansion (Theorem 35) express
it as a linear combination of $k$-rowed minor determinants
of A. Hence $d_k$ is a divisor of $d_{k+1}$ for $k = 1, 2, \cdots, r-1$.
The quotients

\[ \frac{d_r}{d_{r-1}} = h_r, \quad \frac{d_{r-1}}{d_{r-2}} = h_{r-1}, \quad \cdots, \quad d_1 = h_1 \]

are therefore polynomials in $x$, with leading coefficients
equal to 1, which determine the determinantal divisors
uniquely, and are in turn uniquely defined by them.
These polynomials

\[ h_1, h_2, \cdots, h_r, \]

none of which can be 0, are called the invariant factors
of A.

We have proved 

Theorem 69. If two matrices with elements in F[x]
are equivalent, they have the same determinantal divisors
and the same invariant factors.
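
The definitions of this section can be tried out on a small case. The following Python sketch computes the determinantal divisors $d_k$ as monic greatest common divisors of the $k$-rowed minors, and the invariant factors as their successive quotients. It is only an illustration under stated assumptions: polynomials are represented as coefficient lists over the rationals (lowest degree first), the matrix is square, and the function names and the test matrix diag$(x, x-1)$ are chosen for the example rather than taken from the text.

```python
from fractions import Fraction
from itertools import combinations

def trim(p):
    """Drop trailing zero coefficients; the zero polynomial is []."""
    p = list(p)
    while p and p[-1] == 0:
        p.pop()
    return p

def add(p, q):
    n = max(len(p), len(q))
    return trim([(p[i] if i < len(p) else 0) + (q[i] if i < len(q) else 0)
                 for i in range(n)])

def scale(p, c):
    return trim([c * a for a in p])

def mul(p, q):
    if not p or not q:
        return []
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return trim(out)

def poly_divmod(p, q):
    """Quotient and remainder of p by the non-zero polynomial q."""
    p = [Fraction(a) for a in p]
    quot = [Fraction(0)] * max(len(p) - len(q) + 1, 0)
    while p and len(p) >= len(q):
        shift, c = len(p) - len(q), p[-1] / q[-1]
        quot[shift] = c
        for i, b in enumerate(q):
            p[shift + i] -= c * b
        p = trim(p)
    return trim(quot), p

def monic_gcd(p, q):
    p, q = trim(p), trim(q)
    while q:
        p, q = q, poly_divmod(p, q)[1]
    return scale(p, 1 / Fraction(p[-1])) if p else []

def det(m):
    """Determinant by Laplace expansion along the first row."""
    if len(m) == 1:
        return trim(m[0][0])
    total = []
    for j, entry in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total = add(total, scale(mul(trim(entry), det(minor)), (-1) ** j))
    return total

def determinantal_divisors(m):
    n, divisors = len(m), []
    for k in range(1, n + 1):
        g = []
        for rows in combinations(range(n), k):
            for cols in combinations(range(n), k):
                g = monic_gcd(g, det([[m[r][c] for c in cols] for r in rows]))
        if not g:                 # every k-rowed minor is zero: k exceeds the rank
            break
        divisors.append(g)
    return divisors

def invariant_factors(m):
    factors, previous = [], [Fraction(1)]
    for d in determinantal_divisors(m):
        factors.append(poly_divmod(d, previous)[0])
        previous = d
    return factors

# diag(x, x - 1): the coefficient list [0, 1] is x and [-1, 1] is x - 1.
M = [[[0, 1], []], [[], [-1, 1]]]
print(determinantal_divisors(M))   # d_1 = 1, d_2 = x^2 - x
print(invariant_factors(M))        # h_1 = 1, h_2 = x^2 - x
```

Exhaustive enumeration of minors is practical only for small orders; it is used here because it follows the definition directly.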

51. A canonical form. Let A be a square matrix of
order $n$ and rank $r$ with elements in F[x]. By means of
elementary transformations on the rows and columns
we may reduce A to a canonical form.

Let

\[ A = \begin{bmatrix}
a_{11} & a_{12} & \cdots & a_{1n} \\
a_{21} & a_{22} & \cdots & a_{2n} \\
\vdots & & & \vdots \\
a_{n1} & a_{n2} & \cdots & a_{nn}
\end{bmatrix}. \]

If $r > 0$, we can permute the rows and columns so that
$a_{11} \ne 0$ and the degree of $a_{11}$ does not exceed the degree
of any non-zero element of A. Let $a_{1i}$ be any element of the
first row $\ne a_{11}$. We may write

\[ a_{1i} = q a_{11} + t \]

where either $t = 0$ or $t$ is of lower degree than $a_{11}$. Then if
we subtract from the $i$-th column $q$ times the first column,
the new element in the place of $a_{1i}$ will be $t$. If
$t \ne 0$, it can be moved into the $a_{11}$ position and we may
start again with an $a_{11}$ of lower degree. In a finite number
of steps we shall have a new matrix A with $a_{11} \ne 0$ and
every other element of the first row and of the first column
equal to 0.

Unless $a_{11}$ divides every element of this new matrix
A, its degree can be still further reduced. Suppose the
element in the $(i, j)$-position in the new matrix A to be

\[ a_{ij} = q a_{11} + t \]

where $t \ne 0$ is of lower degree than $a_{11}$. If we add the $i$-th row
to the first row, we have placed this polynomial $a_{ij}$ in
the $a_{1j}$ position, and as before we can replace $a_{11}$ by $t$.
We may now assume that A has the form

\[ A_1 = \begin{bmatrix}
a_{11} & 0 & 0 & \cdots & 0 \\
0 & a_{22} & a_{23} & \cdots & a_{2n} \\
0 & a_{32} & a_{33} & \cdots & a_{3n} \\
\vdots & & & & \vdots \\
0 & a_{n2} & a_{n3} & \cdots & a_{nn}
\end{bmatrix} \]

where $a_{11}$ is a divisor of every element of $A_1$. If $r > 1$, we
may now ignore the first row and first column, and in a
similar manner reduce A to the form

\[ A_2 = \begin{bmatrix}
a_{11} & 0 & 0 & 0 & \cdots & 0 \\
0 & a_{22} & 0 & 0 & \cdots & 0 \\
0 & 0 & a_{33} & a_{34} & \cdots & a_{3n} \\
0 & 0 & a_{43} & a_{44} & \cdots & a_{4n} \\
\vdots & & & & & \vdots \\
0 & 0 & a_{n3} & a_{n4} & \cdots & a_{nn}
\end{bmatrix} \]

where $a_{22}$ is a divisor of every element of $A_2$ except perhaps
$a_{11}$. Since $A_2$ was obtained from $A_1$ by elementary
transformations, every element of $A_2$ is equal to a linear
homogeneous function of the elements of $A_1$. Since $a_{11}$ is
a divisor of every element of $A_1$, it is a divisor of every
element of $A_2$.

By a continuation of this process we eventually obtain

\[ A_r = \begin{bmatrix}
a_{11} & 0 & \cdots & 0 & \cdots & 0 \\
0 & a_{22} & \cdots & 0 & \cdots & 0 \\
\vdots & & & & & \vdots \\
0 & 0 & \cdots & a_{rr} & \cdots & 0 \\
\vdots & & & & & \vdots \\
0 & 0 & \cdots & 0 & \cdots & 0
\end{bmatrix} \]

where $a_{ii}$ divides $a_{i+1,i+1}$. If the rank of A is $r$, every
element of $A_r$ whose row and column indices exceed $r$ must
be 0, for otherwise a minor determinant of order $r + 1$
could be formed which would not be equal to 0.

Since $a_{11}$ divides $a_{22}$, $a_{22}$ divides $a_{33}$, etc., it is clear
that the determinantal divisors are

\[ d_r = a_{11} a_{22} \cdots a_{rr}, \quad d_{r-1} = a_{11} a_{22} \cdots a_{r-1,r-1}, \quad \cdots, \quad d_1 = a_{11} \]

and that the invariant factors are

\[ h_1 = a_{11}, \quad h_2 = a_{22}, \quad \cdots, \quad h_r = a_{rr}. \]

We have proved 

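Theorem 70. Every matrix A of order $n$ and rank $r$
with the invariant factors $h_1, h_2, \cdots, h_r$ is equivalent to
the diagonal matrix

\[ \begin{bmatrix}
h_1 & 0 & \cdots & 0 & \cdots & 0 \\
0 & h_2 & \cdots & 0 & \cdots & 0 \\
\vdots & & & & & \vdots \\
0 & 0 & \cdots & h_r & \cdots & 0 \\
\vdots & & & & & \vdots \\
0 & 0 & \cdots & 0 & \cdots & 0
\end{bmatrix}. \]
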
Corollary 70. Each invariant factor $h_i$ is a divisor
of the next invariant factor $h_{i+1}$.
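
The reduction of §51 is an algorithm, and its steps can be sketched in code. To keep the example short, the Python sketch below carries out the same procedure over the rational integers, which §49 notes share many properties of F[x]; absolute value plays the part of degree, and the units are +1 and -1. This substitution of integers for polynomials is an assumption made for illustration, and the function name `smith_diagonal` is hypothetical.

```python
def smith_diagonal(matrix):
    """Reduce an integer matrix to diagonal form, each diagonal
    element dividing the next, using elementary transformations."""
    a = [row[:] for row in matrix]
    n_rows, n_cols = len(a), len(a[0])
    for t in range(min(n_rows, n_cols)):
        while True:
            # Type I: bring a non-zero element of least size to position (t, t).
            entries = [(abs(a[i][j]), i, j)
                       for i in range(t, n_rows)
                       for j in range(t, n_cols) if a[i][j] != 0]
            if not entries:
                return a                      # the remaining block is zero
            _, r, c = min(entries)
            a[t], a[r] = a[r], a[t]
            for row in a:
                row[t], row[c] = row[c], row[t]
            pivot = a[t][t]
            # Type III: replace the rest of row t and column t by remainders.
            for j in range(t + 1, n_cols):
                q = a[t][j] // pivot
                for row in a:
                    row[j] -= q * row[t]
            for i in range(t + 1, n_rows):
                q = a[i][t] // pivot
                a[i] = [x - q * y for x, y in zip(a[i], a[t])]
            if any(a[t][t + 1:]) or any(a[i][t] for i in range(t + 1, n_rows)):
                continue                      # a smaller remainder appeared
            # If the pivot does not divide some element, add that row to row t.
            bad = [i for i in range(t + 1, n_rows)
                   if any(v % pivot for v in a[i][t + 1:])]
            if not bad:
                break
            a[t] = [x + y for x, y in zip(a[t], a[bad[0]])]
        if a[t][t] < 0:                       # Type II: multiply by the unit -1
            a[t] = [-v for v in a[t]]
    return a

print(smith_diagonal([[2, 0], [0, 3]]))
# [[1, 0], [0, 6]]
print(smith_diagonal([[2, 4, 4], [-6, 6, 12], [10, -4, -16]]))
# [[2, 0, 0], [0, 6, 0], [0, 0, 12]]
```

The first call shows why a diagonal matrix need not already be canonical: 2 does not divide 3, so the procedure continues until the diagonal becomes 1, 6.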

We may now prove the important theorem, 

Theorem 71. Two matrices A and B of order $n$ with
elements in F[x] are equivalent if and only if they have
the same invariant factors.

By Theorem 69 two equivalent matrices have the 
same invariant factors. By Theorem 70 two matrices 
having the same invariant factors are equivalent to the 
same matrix and hence are equivalent to each other. 

52. Elementary divisors. Let $l_1, l_2, \cdots, l_p$ be the
irreducible polynomials, each with leading coefficient 1,
which divide at least one of the invariant factors $h_i$. It
is then possible to write

\[ h_i = l_1^{e_{i1}} l_2^{e_{i2}} \cdots l_p^{e_{ip}}, \qquad i = 1, 2, \cdots, r, \]

where each exponent $e$ is a positive integer or 0. The
prime power factors

\[ l_1^{e_{11}}, \cdots, l_p^{e_{1p}}, \quad \cdots, \quad l_1^{e_{r1}}, \cdots, l_p^{e_{rp}} \]

whose exponents are actually greater than zero are
called the elementary divisors of the matrix A.

It is obvious that the invariant factors determine the
elementary divisors uniquely. It is true, conversely, that
the elementary divisors (without regard for their order)
and the integer $r$ determine the invariant factors
uniquely. Since $h_r$ is divisible by each of the other $h$'s,
$h_r$ is the product of the highest power of $l_1$ by the highest
power of $l_2$, and so on, by the highest power of $l_p$ which occur
among the elementary divisors. As these are used, they
may be crossed out. Then $h_{r-1}$ is the product of the
highest power of $l_1$ which remains by the highest power
of $l_2$ which remains, etc. After the elementary divisors
are all used up, the remaining $h$'s are equal to 1.

Thus if $r = 4$ and the elementary divisors are

\[ x^4, \quad x^3, \quad x, \quad (x - 1)^3, \quad (x - 1)^2, \]

the invariant factors are

\[ h_4 = x^4 (x - 1)^3, \quad h_3 = x^3 (x - 1)^2, \quad h_2 = x, \quad h_1 = 1. \]

We have proved 

Theorem 72. Two matrices have the same invariant 
factors if and only if they have the same rank and the same 
elementary divisors . 

It is not immediately evident that the elementary 
divisors are more useful than the invariant factors. Their 
introduction is thoroughly justified, however, by the 
following theorem. 

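Theorem 73. If A is equivalent to the matrix

\[ \begin{bmatrix}
g_1 & 0 & \cdots & 0 & \cdots & 0 \\
0 & g_2 & \cdots & 0 & \cdots & 0 \\
\vdots & & & & & \vdots \\
0 & 0 & \cdots & g_r & \cdots & 0 \\
\vdots & & & & & \vdots \\
0 & 0 & \cdots & 0 & \cdots & 0
\end{bmatrix} \]

where each $g_i \ne 0$ but it is not necessarily true that $g_i$ divides
$g_{i+1}$, then the prime power factors of the $g$'s are the
elementary divisors of A.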

Let $l$ be any prime factor of any one of the $g$'s, and
arrange the $g$'s according to ascending powers of this
prime factor. That is,

\[ g_{i_1} = f_1 l^{k_1}, \quad g_{i_2} = f_2 l^{k_2}, \quad \cdots, \quad g_{i_r} = f_r l^{k_r} \]

where the $f$'s are prime to $l$ and

\[ k_1 \le k_2 \le \cdots \le k_r. \]

It is evident that the highest power of $l$ which divides
the determinantal divisor $d_i$ has the exponent

\[ k_1 + k_2 + \cdots + k_i. \]

Hence the highest power of $l$ which divides the invariant
factor $h_i$ is $l^{k_i}$, so that for every $i$ for which $k_i \ge 1$ this
prime power is an elementary divisor of A. If this argument
is repeated for every prime which divides a $g$, we
shall see that every prime power factor of every $g$ is an
elementary divisor of A, and all elementary divisors of
A are thus obtained.

This theorem is of great practical importance. Consider
the matrix

\[ A = \begin{bmatrix}
x^2 (x - 1)^2 & 0 & 0 \\
0 & x (x - 1)^3 & 0 \\
0 & 0 & x
\end{bmatrix}. \]

The elementary divisors are

\[ x^2, \quad (x - 1)^2, \quad x, \quad (x - 1)^3, \quad x \]

so that the invariant factors are

\[ h_3 = x^2 (x - 1)^3, \quad h_2 = x (x - 1)^2, \quad h_1 = x. \]

Hence A is equivalent to the matrix

\[ \begin{bmatrix}
x & 0 & 0 \\
0 & x (x - 1)^2 & 0 \\
0 & 0 & x^2 (x - 1)^3
\end{bmatrix}. \]

The actual reduction of A to this form would entail
considerable calculation.
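
The rule of §52 for recovering the invariant factors from the elementary divisors is easily mechanized. In the Python sketch below each elementary divisor is given as a pair (prime, exponent), the primes being identified merely by labels such as "x" and "x-1"; this representation and the function name are assumptions made for the illustration. Each invariant factor is returned as a dictionary from prime labels to exponents, the empty dictionary standing for the invariant factor 1.

```python
def invariant_factors(elementary_divisors, r):
    """Return [h_1, ..., h_r], each as a dict {prime label: exponent}."""
    remaining = {}
    for prime, exponent in elementary_divisors:
        remaining.setdefault(prime, []).append(exponent)
    for exponents in remaining.values():
        exponents.sort()                  # the highest power is popped first
    factors = []
    for _ in range(r):                    # builds h_r, h_{r-1}, ..., h_1 in turn
        h = {prime: exponents.pop()
             for prime, exponents in remaining.items() if exponents}
        factors.append(h)
    if any(remaining.values()):
        raise ValueError("more powers of one prime than the rank r allows")
    return factors[::-1]

# The diagonal matrix of the example: x^2 (x-1)^2, x (x-1)^3, x.
divisors = [("x", 2), ("x-1", 2), ("x", 1), ("x-1", 3), ("x", 1)]
for i, h in enumerate(invariant_factors(divisors, 3), start=1):
    print(f"h_{i} =", h)
# h_1 = {'x': 1}
# h_2 = {'x': 1, 'x-1': 2}
# h_3 = {'x': 2, 'x-1': 3}
```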
53. Elementary divisors of a direct sum. The following
generalization of Theorem 73 will prove useful.

Theorem 74. If

\[ A = A_1 \dotplus A_2 \dotplus \cdots \dotplus A_t, \]

the elementary divisors of A are the elementary divisors of
all the matrices $A_i$ taken together.

By Theorem 71 we can find non-singular matrices $P_i$
and $Q_i$ such that

\[ P_i A_i Q_i = A_i' \]

where $A_i'$ is diagonal. Define

\[ P = P_1 \dotplus P_2 \dotplus \cdots \dotplus P_t, \qquad Q = Q_1 \dotplus Q_2 \dotplus \cdots \dotplus Q_t. \]

By (d) of §42,

\[ PAQ = A_1' \dotplus A_2' \dotplus \cdots \dotplus A_t' = A', \]

and by (g) of §42, both P and Q are non-singular.
Clearly $A'$ is diagonal. By Theorem 73 the elementary
divisors of the $A_i$ are the prime power factors of the
diagonal elements of the $A_i'$, and by this same theorem
the elementary divisors of A are the prime power factors
of the diagonal elements of all the $A_i'$ taken together.

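Consider the canonical form (35) which was of so
much importance in Chapter VI. Its characteristic
matrix is

\[ \begin{bmatrix}
-x & 0 & -l_0 & 0 & 0 & 0 & 0 & 0 & 0 \\
1 & -x & -l_1 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 1 & -l_2 - x & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 1 & -x & 0 & -l_0 & 0 & 0 & 0 \\
0 & 0 & 0 & 1 & -x & -l_1 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 1 & -l_2 - x & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 1 & -x & 0 & -l_0 \\
0 & 0 & 0 & 0 & 0 & 0 & 1 & -x & -l_1 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & -l_2 - x
\end{bmatrix} \qquad (40) \]
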
whose determinant is, apart from sign, the 9th determinantal
divisor $d_9$. By the Laplace expansion (Theorem 35)

\[ d_9 = \pm \begin{vmatrix}
-x & 0 & -l_0 \\
1 & -x & -l_1 \\
0 & 1 & -l_2 - x
\end{vmatrix}^3 \]

and by §31,

\[ d_9 = [l(x)]^3. \]

Now if we cross out the first row and last column of
the matrix (40), we obtain a matrix having nothing but
0's below the main diagonal and 1's in this diagonal. The
determinant of this matrix is 1, so that the 8th
determinantal divisor is $d_8 = 1$. In fact

\[ d_1 = d_2 = \cdots = d_8 = 1. \]

Thus the invariant factors are

\[ h_1 = h_2 = \cdots = h_8 = 1, \qquad h_9 = [l(x)]^3 \]

and the only elementary divisor is $[l(x)]^3$.

In general, we have 

Theorem 75. Every matrix of the form (35) has a
characteristic matrix with a single elementary divisor. If
$l(x)$ is of degree $j$ and if (35) is of order $jh$, this elementary
divisor is $[l(x)]^h$. The canonical form of Type (35) is
uniquely defined by its elementary divisor.

54. Similar matrices. Let A and B be matrices with
elements in the field F which are similar in F. That is,
there exists a non-singular matrix P with elements in F
such that

\[ B = P^{-1} A P. \]

Then

\[ P^{-1}(A - xI)P = P^{-1}AP - P^{-1}xIP = B - xI \]

so that the characteristic matrices of A and B are equivalent
in the domain F[x]. It is then evident from Theorem 69
that, if A and B are similar in F, their characteristic
matrices are matrices with elements in F[x]
which have the same invariant factors. The converse,
which is not so evident, will now be proved.

Theorem 76. Two matrices A and B with elements in
F are similar in F if and only if their characteristic
matrices are equivalent in F[x], that is, if and only if
they have the same invariant factors.

The necessity of the condition has just been proved.

In Chapter VI it was proved that every matrix A
with elements in F is similar to a direct sum of matrices
$A_i$ of Type (35). By Theorem 75, each matrix of Type
(35) has a single elementary divisor and is uniquely
determined by it. By Theorem 74, the collection of these
elementary divisors of the matrices $A_i$ is the set of
elementary divisors of A. The order of the components $A_i$
in the direct sum can be arranged at will by a similarity
transformation so that, if $A - xI$ and $B - xI$ have the
same elementary divisors, it must be true that their
canonical forms described in Theorem 68 can be taken
to be identical. Then A and B are similar in F.

It is now clear that two matrices with the same
minimum function are not necessarily similar. Thus let
A be a matrix with elements in F whose minimum function
is

\[ m(x) = [l_1(x)]^{h_1} [l_2(x)]^{h_2} \cdots [l_k(x)]^{h_k} \]

where the polynomials $l_i(x)$ are irreducible and relatively
prime in pairs. The elementary divisors of $A - xI$
are the factors $[l_i(x)]^{h_i}$ and divisors of these factors.

Consider for a moment the matrix considered in §45
of order $n$ and index $n$ having the minimum function

\[ m(x) = [l(x)]^h, \qquad n = jh, \]

so that $l(x)$ was of degree $j$. We chose the vector $\phi$ to be
any vector in the null space of $[l(A)]^h$ which was not in
the null space of $[l(A)]^{h-1}$, and by means of $\phi$ constructed
a subspace $\Sigma_1$ of the total space S. Then in §46
we chose a vector $\psi$ such that $[l(A)]^k \psi$ was in $\Sigma_1$, but
$[l(A)]^{k-1} \psi$ was not in $\Sigma_1$, and such that $k$ was maximal,
and constructed another subspace $\Sigma_2$. We then showed
that $[l(x)]^k$ was another elementary divisor of $A - xI$.
Then in §47 we chose another vector $\chi$ such that
$[l(A)]^e \chi$ was in $\Sigma_1 \cup \Sigma_2$, while $[l(A)]^{e-1} \chi$ was not, and
such that $e$ was maximal. This led to another elementary
divisor $[l(x)]^e$, etc.

In §§45-47 it appeared that the vector $\psi$ depended
upon the choice of $\phi$, that the vector $\chi$ depended upon
the choice of both $\phi$ and $\psi$, etc. It seemed probable that
the value of $k$ also depended upon what vector $\phi$ was
selected, and that $e$ depended upon both $\phi$ and $\psi$. We
now see that the integers $h, k, e, \cdots$ are invariants of
the canonical form and thus are independent of the
choice (within their range of definition) of the vectors
$\phi, \psi, \chi, \cdots$.

These integers $h, k, e, \cdots$ constitute the Segre
characteristic of A relative to the prime factor $l(x)$ of the
minimum equation.

55. The Weyr characteristic. Let A be a matrix with
elements in F having the minimum function

\[ m(x) = [l_1(x)]^{h_1} [l_2(x)]^{h_2} \cdots [l_k(x)]^{h_k} \]

where the $l_i(x)$ are irreducible and distinct. Let

\[ g_{i1}, \quad g_{i1} + g_{i2}, \quad g_{i1} + g_{i2} + g_{i3}, \quad \cdots, \quad g_{i1} + g_{i2} + \cdots + g_{i h_i} \]

denote the nullities (see §40) of the respective matrices

\[ l_i(A), \quad [l_i(A)]^2, \quad \cdots, \quad [l_i(A)]^{h_i}. \]

Since the nullity of a matrix is its order minus its rank,
the integer $g_{ik}$ indicates the drop in rank between the
matrix $[l_i(A)]^{k-1}$ and the matrix $[l_i(A)]^k$.

Theorem 77. Every integer $g_{ik}$ is divisible by the
degree $j_i$ of the irreducible polynomial $l_i(x)$.

It is immediately evident from (30) of §41 that these
integers $g_{ik}$ are invariant under similarity transformations
upon A. By Theorem 68, A is similar to a direct
sum of non-derogatory matrices each of which has a
minimum function of the type $[l(x)]^h$. That is, A is similar to

\[ A_1 \dotplus \cdots \dotplus A_t \dotplus A_{t+1} \dotplus \cdots \dotplus A_v \]

where $A_1, \cdots, A_t$ have minimum functions which are
powers of $l_i(x)$, while $A_{t+1}, \cdots, A_v$ have minimum
functions which are prime to $l_i(x)$. By (30) and (k) of
§42, $[l_i(A)]^k$ is similar to

\[ [l_i(A_1)]^k \dotplus \cdots \dotplus [l_i(A_t)]^k \dotplus [l_i(A_{t+1})]^k \dotplus \cdots \dotplus [l_i(A_v)]^k. \]

By Theorem 44 each of the matrices

\[ [l_i(A_{t+1})]^k, \quad \cdots, \quad [l_i(A_v)]^k \]

is non-singular and hence of nullity zero for every value
of $k$, while by Corollary 57 each of the matrices

\[ [l_i(A_1)]^k, \quad \cdots, \quad [l_i(A_t)]^k \]

is of nullity $j_i k$ if $j_i k$ is less than or equal to its order,
otherwise it is a zero matrix.

The nullity of a direct sum is obviously equal to the
sum of the nullities of the components (see (h) of §42),
whence it follows that the nullity of $[l_i(A)]^k$ is divisible
by $j_i$, and so therefore is every integer $g_{ik}$.

Set $g_{ik} = w_{ik} j_i$. The integers

\[ w_{i1}, w_{i2}, \cdots, w_{i h_i} \]

constitute the Weyr characteristic of the matrix A relative
to the irreducible factor $l_i(x)$ of its minimum function.
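
The definition can be followed literally in a short computation. The Python sketch below treats only the case of a linear factor $l(x) = x - c$, so that $j = 1$ and the Weyr characteristic is simply the sequence of successive increases in the nullity of $(A - cI)^k$. The function names, the exact rational elimination used for the rank, and the 6-rowed test matrix (a direct sum of blocks of orders 3, 2 and 1) are assumptions made for the illustration.

```python
from fractions import Fraction

def rank(matrix):
    """Rank by Gaussian elimination with exact rational arithmetic."""
    m = [[Fraction(v) for v in row] for row in matrix]
    found = 0
    for col in range(len(m[0])):
        pivot = next((r for r in range(found, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[found], m[pivot] = m[pivot], m[found]
        for r in range(found + 1, len(m)):
            factor = m[r][col] / m[found][col]
            m[r] = [a - factor * b for a, b in zip(m[r], m[found])]
        found += 1
    return found

def mat_mul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def weyr_characteristic(a, c):
    """Weyr characteristic of a relative to l(x) = x - c (degree j = 1)."""
    n = len(a)
    shifted = [[a[i][j] - (c if i == j else 0) for j in range(n)]
               for i in range(n)]
    power = [[int(i == j) for j in range(n)] for i in range(n)]
    nullities = [0]
    while True:
        power = mat_mul(power, shifted)
        nullity = n - rank(power)
        if nullity == nullities[-1]:      # no further drop in rank
            break
        nullities.append(nullity)
    return [later - earlier for earlier, later in zip(nullities, nullities[1:])]

# Blocks of orders 3 and 2 belonging to x - 2, and one of order 1 belonging to x - 7.
A = [[2, 0, 0, 0, 0, 0],
     [1, 2, 0, 0, 0, 0],
     [0, 1, 2, 0, 0, 0],
     [0, 0, 0, 2, 0, 0],
     [0, 0, 0, 1, 2, 0],
     [0, 0, 0, 0, 0, 7]]
print(weyr_characteristic(A, 2))   # [2, 2, 1]
print(weyr_characteristic(A, 7))   # [1]
```

For the factor $x - 2$ the block orders are 3 and 2, and the computed sequence 2, 2, 1 has the same sum, 5.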


Theorem 78. The Segre characteristic and the Weyr
characteristic of the matrix A relative to the same irreducible
factor $l(x)$ of its minimum function are conjugate
partitions of the same integer.

To avoid elaborate notation we shall take a special
example which is typical of the general situation. Let A
have the Segre characteristic (5, 4, 2, 2) relative to the
irreducible factor $l(x)$ of degree $j$ of its minimum function.
Then

\[ A = A_1 \dotplus A_2 \dotplus A_3 \dotplus A_4 \dotplus \cdots \]

where each of the $A$'s is non-derogatory. The minimum
functions of $A_1, A_2, A_3, A_4$ are by Theorem 68 respectively

\[ [l(x)]^5, \quad [l(x)]^4, \quad [l(x)]^2, \quad [l(x)]^2, \]

while the matrices represented by dots have minimum
functions which are prime to $l(x)$. Then for every $i$

\[ [l(A)]^i = [l(A_1)]^i \dotplus [l(A_2)]^i \dotplus [l(A_3)]^i \dotplus [l(A_4)]^i \dotplus B_i \]

where $B_i$ is a non-singular matrix. By Corollary 57

\[ l(A_1), \quad l(A_2), \quad l(A_3), \quad l(A_4) \]

are each of nullity $j$ so that $g_1 = 4j$ and $w_1 = 4$. Then

\[ [l(A_1)]^2, \quad [l(A_2)]^2, \quad [l(A_3)]^2, \quad [l(A_4)]^2 \]

are each of nullity $2j$ so that

\[ g_1 + g_2 = 8j, \qquad w_2 = 4. \]

But the last two of the above matrices are 0 so that there
is no further increase in nullity when the exponents of
$l(A_3)$ and $l(A_4)$ are increased. Thus

\[ g_1 + g_2 + g_3 = 10j, \qquad g_1 + g_2 + g_3 + g_4 = 12j, \]
\[ w_3 = 2, \qquad w_4 = 2. \]

Now $[l(A_2)]^4 = 0$ so that

\[ g_1 + g_2 + g_3 + g_4 + g_5 = 13j, \qquad w_5 = 1. \]

There are no further non-zero $w$'s.

Now let us arrange the integers of the Segre
characteristic as rows of dots,

    •  •  •  •  •
    •  •  •  •
    •  •
    •  •

The numbers of dots in the columns, namely (4, 4, 2, 2, 1),
are the numbers of the conjugate partition of 13. That
they constitute the Weyr characteristic is obvious, for
the increase in nullity in passing from $[l(A)]^{i-1}$ to
$[l(A)]^i$ is equal to $j$ times the number of matrices $A_i$ whose
minimum functions are powers of $l(x)$ having an exponent
$\ge i$, that is, $j$ times the number of dots in the $i$-th column
of the above display.
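
The passage from row lengths to column lengths of such a display is the operation of taking the conjugate partition, and it can be sketched in one line of Python. The function name is an assumption made for the illustration; the partition (5, 4, 2, 2) is the one used in the example above.

```python
def conjugate_partition(parts):
    """Column lengths of the dot display whose row lengths are `parts`."""
    return [sum(1 for p in parts if p >= i)
            for i in range(1, max(parts, default=0) + 1)]

segre = [5, 4, 2, 2]
weyr = conjugate_partition(segre)
print(weyr)                          # [4, 4, 2, 2, 1]
print(conjugate_partition(weyr))     # [5, 4, 2, 2]
print(sum(segre), sum(weyr))         # 13 13
```

Applying the operation twice returns the original partition, which is why either characteristic determines the other.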

A really typical numerical example would require a
matrix of quite high order, but the example of §32 will
be of some assistance. We have

\[ A = \begin{bmatrix} 7 & 4 & -1 \\ 4 & 7 & -1 \\ -4 & -4 & 4 \end{bmatrix} \]

which is of rank 3 and whose characteristic function is

\[ f(x) = (x - 3)^2 (x - 12). \]

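The matrix

\[ A - 3I = \begin{bmatrix} 4 & 4 & -1 \\ 4 & 4 & -1 \\ -4 & -4 & 1 \end{bmatrix} \]

is of rank 1 so that $g_{11} = 2$ and $w_{11} = 2$. Since $[A - 3I]^2$ is
also of rank 1, $w_{12}$ does not exist.

Furthermore

\[ A - 12I = \begin{bmatrix} -5 & 4 & -1 \\ 4 & -5 & -1 \\ -4 & -4 & -8 \end{bmatrix} \]

is of rank 2, and so is its square. Thus $g_{21} = w_{21} = 1$.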

The conjugate partition of 2 is 1 + 1, and the conjugate
partition of 1 is 1. Thus A has the elementary
divisors

\[ x - 3, \quad x - 3, \quad x - 12 \]

and the invariant factors

\[ h_3 = (x - 3)(x - 12), \quad h_2 = x - 3, \quad h_1 = 1. \]

The canonical form is

\[ \begin{bmatrix} 3 & 0 & 0 \\ 0 & 3 & 0 \\ 0 & 0 & 12 \end{bmatrix} \]

and the minimum function is

\[ m(x) = h_3 = (x - 3)(x - 12). \]

56. Collineations. Since the early work in elementary
divisor theory was motivated by projective geometry,
it may not be out of place to see what the connections
between these two topics are.

Referred to a system of trilinear homogeneous coordinates,
the equations

$$\begin{aligned}
x_1' &= a_{11}x_1 + a_{12}x_2 + a_{13}x_3, \\
x_2' &= a_{21}x_1 + a_{22}x_2 + a_{23}x_3, \\
x_3' &= a_{31}x_1 + a_{32}x_2 + a_{33}x_3
\end{aligned}$$

define a projective transformation or collineation of the
points of the projective plane. We shall assume that the
elements $a_{ij}$ as well as the $x$'s belong to the real field $R$.

The above transformation may be written in the
form

$$\xi' = A\xi$$

where $\xi$ is the vector $(x_1, x_2, x_3)$, $\xi' = (x_1', x_2', x_3')$ and
$A = (a_{rs})$ is the matrix of the collineation. If $\xi'' = B\xi'$ is
another collineation, then evidently

$$\xi'' = BA\xi$$

is the resultant collineation. The matrix of the resultant
of two collineations, then, is equal to the product of the
matrices of these collineations in reverse order. This is
one way in which the concept of product of two matrices
can be introduced.

Let us suppose that a new system of trilinear coordinates
is chosen. Every point with coordinates $(x_1, x_2, x_3)$
in the old coordinate system will have coordinates
$(y_1, y_2, y_3)$ in the new coordinate system, and vice
versa. There will exist a set of equations

$$\begin{aligned}
x_1 &= t_{11}y_1 + t_{12}y_2 + t_{13}y_3, \\
x_2 &= t_{21}y_1 + t_{22}y_2 + t_{23}y_3, \\
x_3 &= t_{31}y_1 + t_{32}y_2 + t_{33}y_3
\end{aligned}$$

relating these sets of coordinates. If we denote $(y_1, y_2, y_3)$
by $\eta$, these equations may be written

$$\xi = T\eta$$

where $T = (t_{rs})$ is non-singular. Conversely every non-singular
matrix $T$ defines a coordinate system.

The point $(x_1', x_2', x_3')$ will also have coordinates
$(y_1', y_2', y_3')$ in the new coordinate system. Then

$$\xi' = T\eta', \qquad \eta' = (y_1', y_2', y_3').$$

That is,

$$T\eta' = \xi' = A\xi = AT\eta$$

so that

$$\eta' = T^{-1}AT\eta.$$

We have proved

Theorem 79. If a collineation has the matrix $A$ in one
trilinear homogeneous coordinate system, it has the similar
matrix $T^{-1}AT$ in every such coordinate system; and conversely
every matrix similar to $A$ is the matrix of this collineation
in some coordinate system.
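
The derivation of Theorem 79 can be checked numerically. The following is a minimal sketch in exact rational arithmetic; the matrices `A` and `T`, the point `eta`, and the helper names `matmul` and `inverse` are all hypothetical choices made for illustration.

```python
from fractions import Fraction


def matmul(X, Y):
    """Ordinary matrix product of two lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y))) for j in range(len(Y[0]))]
            for i in range(len(X))]


def inverse(M):
    """Gauss-Jordan inverse over the rationals (StopIteration if M is singular)."""
    n = len(M)
    aug = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(M)]
    for c in range(n):
        pivot = next(r for r in range(c, n) if aug[r][c] != 0)
        aug[c], aug[pivot] = aug[pivot], aug[c]
        p = aug[c][c]
        aug[c] = [x / p for x in aug[c]]
        for r in range(n):
            if r != c:
                f = aug[r][c]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[c])]
    return [row[n:] for row in aug]


A = [[1, 2, 0], [0, 1, 3], [4, 0, 1]]    # hypothetical collineation matrix
T = [[1, 1, 0], [0, 1, 1], [1, 0, 2]]    # hypothetical non-singular change of coordinates
eta = [[Fraction(2)], [Fraction(-1)], [Fraction(5)]]   # new coordinates of a point

xi = matmul(T, eta)                          # old coordinates: xi = T eta
xi_prime = matmul(A, xi)                     # image in old coordinates
eta_prime = matmul(inverse(T), xi_prime)     # image in new coordinates
similar = matmul(matmul(inverse(T), A), T)   # T^{-1} A T
print(eta_prime == matmul(similar, eta))     # True
```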

Two vectors are the homogeneous coordinates of the
same point if and only if they are proportional. Thus
$(x_1, x_2, x_3)$ is left invariant by a collineation if and only
if there exists a non-zero real number $\lambda$ such that

$$x_1' = \lambda x_1, \qquad x_2' = \lambda x_2, \qquad x_3' = \lambda x_3.$$

Hence $\lambda$ must satisfy the linear homogeneous equations

$$\begin{aligned}
(a_{11} - \lambda)x_1 + a_{12}x_2 + a_{13}x_3 &= 0, \\
a_{21}x_1 + (a_{22} - \lambda)x_2 + a_{23}x_3 &= 0, \\
a_{31}x_1 + a_{32}x_2 + (a_{33} - \lambda)x_3 &= 0.
\end{aligned} \tag{41}$$

The number of solutions will depend upon the rank of
the matrix

$$A - \lambda I.$$

In order that there may be any solution at all with
$(x_1, x_2, x_3)$ not all zero, it is necessary that $\lambda$ be a solution
of the characteristic equation

$$|A - \lambda I| = 0$$

of the matrix $A$.

If $\lambda_1$ is a characteristic root, then $A - \lambda_1 I$ is singular.
If $A - \lambda_1 I$ is of rank 2, equations (41) with $\lambda_1$ substituted
for $\lambda$ have but one linearly independent solution, so that
corresponding to $\lambda_1$ there is just one invariant point of
the collineation. If, however, $A - \lambda_1 I$ is of rank 1, equations
(41) have two non-proportional solutions

$$(a_1, a_2, a_3), \qquad (b_1, b_2, b_3),$$

and all vectors of the form

$$(ka_1 + lb_1, \; ka_2 + lb_2, \; ka_3 + lb_3)$$

are solutions where $k$ and $l$ are arbitrary. In this case
there is a whole line of invariant points. If $A - \lambda_1 I$ is of
rank 0, every point of the plane is invariant. Thus the
number of invariant points depends upon the first component
of the Weyr characteristic corresponding to each
real linear factor of the characteristic equation of $A$.
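
The dependence of the invariant points on the rank of $A - \lambda I$ can be illustrated with a short sketch. The matrix below is a hypothetical collineation matrix with characteristic roots 3, 3, 12, and the helper names `rank` and `shifted` are likewise assumptions made for illustration.

```python
from fractions import Fraction


def rank(M):
    """Rank over the rationals, by Gaussian elimination."""
    rows = [[Fraction(x) for x in row] for row in M]
    r = 0
    for c in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue                      # no pivot in this column
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            f = rows[i][c] / rows[r][c]
            rows[i] = [x - f * y for x, y in zip(rows[i], rows[r])]
        r += 1
    return r


def shifted(A, lam):
    """Return A - lam*I."""
    n = len(A)
    return [[A[i][j] - (lam if i == j else 0) for j in range(n)] for i in range(n)]


A = [[7, 4, -1], [4, 7, -1], [-4, -4, 4]]   # hypothetical matrix with roots 3, 3, 12
invariants = {1: "one invariant point", 2: "a line of invariant points",
              3: "every point invariant"}
for lam in (3, 12):
    nullity = len(A) - rank(shifted(A, lam))
    print(lam, nullity, invariants[nullity])
```

For this matrix the root 3 gives nullity 2, hence a line of invariant points, while the root 12 gives nullity 1, hence a single invariant point.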

In the plane case, which we have chosen,

$$|A - \lambda I| = 0$$

is a cubic equation and hence has at least one real root.
If the other two roots are complex, there is just one invariant
point. But if all three roots are real, there are
various possibilities depending upon the elementary divisors
of $A - \lambda I$.

Elementary divisors                                           Segre char.     Weyr char.      Invariants
$\lambda-\lambda_1,\ \lambda-\lambda_2,\ \lambda-\lambda_3$   [1, 1, 1]       [1, 1, 1]       3 distinct pts.
$\lambda-\lambda_1,\ \lambda-\lambda_1,\ \lambda-\lambda_2$   [(1, 1), 1]     [2, 1]          line and pt.
$\lambda-\lambda_1,\ \lambda-\lambda_1,\ \lambda-\lambda_1$   [(1, 1, 1)]     [3]             the plane.
$(\lambda-\lambda_1)^2,\ \lambda-\lambda_2$                   [2, 1]          [(1, 1), 1]     2 distinct pts.
$(\lambda-\lambda_1)^2,\ \lambda-\lambda_1$                   [(2, 1)]        [(2, 1)]        one line
$(\lambda-\lambda_1)^3$                                       [3]             [(1, 1, 1)]     one point.


In the second case, for instance, there are two distinct
characteristic roots. Relative to the first root there
are two elementary divisors $\lambda - \lambda_1$, $\lambda - \lambda_1$, so that the
Segre characteristic is (1, 1). Relative to the second characteristic
root the Segre characteristic is 1. The complete
Segre characteristic, then, may be written
[(1, 1), 1]. The conjugate partition of (1, 1) is 2 so that
the complete Weyr characteristic is [2, 1]. The integer
2 indicates that $A - \lambda_1 I$ is of nullity 2 or rank 1, so that
there is a line of invariant points corresponding to the
characteristic root $\lambda_1$. Since $A - \lambda_2 I$ is of rank 2, it leads
to an invariant point which is not on the line of invariant
points. The other cases are similar.

57. Orthogonal matrices. Consider the ordinary
three-dimensional euclidean space with two sets $x, y, z$
and $x', y', z'$ of orthogonal axes having the same origin.
If a point has the coordinates $(a, b, c)$ relative to the first
set of axes, and the coordinates $(a', b', c')$ relative to the
second set, it is true that

$$\begin{aligned}
a' &= a\cos(x'x) + b\cos(x'y) + c\cos(x'z), \\
b' &= a\cos(y'x) + b\cos(y'y) + c\cos(y'z), \\
c' &= a\cos(z'x) + b\cos(z'y) + c\cos(z'z)
\end{aligned}$$

where $\cos(x'x)$ is the cosine of the angle between the
$x$-axis and the $x'$-axis, etc.

The matrix

$$\begin{bmatrix}
\cos(x'x) & \cos(x'y) & \cos(x'z) \\
\cos(y'x) & \cos(y'y) & \cos(y'z) \\
\cos(z'x) & \cos(z'y) & \cos(z'z)
\end{bmatrix}$$

has some distinctive properties. The elements of each
row are the direction cosines of one of the new coordinate
axes relative to the old axes so that the sum of the
squares of the elements of each row is 1. Since the axes
are mutually perpendicular, the inner product of any
row vector by any other row vector is zero. Moreover
the elements of each column are the direction cosines of
one of the old coordinate axes relative to the new system,
so that the sum of the squares of the elements of
each column is 1, and the inner product of two different
column vectors is 0.

In general, a matrix with elements in a field $F$ is
called an orthogonal matrix if its transpose is equal to its
inverse, that is, $A$ is orthogonal if $AA^T = I$. This implies
that $A$ is non-singular, and that

$$\sum_{t=1}^{n} a_{rt}a_{st} = \delta_{rs}.$$

This means that the sum of the squares of the elements
of each row vector is 1, while the inner product of two
different row vectors is 0. The above conditions are likewise
sufficient that $A$ be orthogonal.

Since $A^T = A^{-1}$, it is also true, if $A$ is orthogonal, that
$A^TA = I$. That is,

$$\sum_{t=1}^{n} a_{tr}a_{ts} = \delta_{rs}.$$

Hence the sum of the squares of the elements of each
column vector is 1, and the inner product of two different
column vectors is zero. These conditions also characterize
an orthogonal matrix.

An orthogonal matrix of order $n$, then, may be
thought of as a transformation in $n$-dimensional euclidean
space from one set of mutually orthogonal axes
to another such set having the same origin.

The product of two orthogonal matrices is orthogonal.
For if

$$AA^T = I, \qquad BB^T = I,$$

then

$$(AB)(AB)^T = ABB^TA^T = AA^T = I.$$

The inverse of an orthogonal matrix is orthogonal, for if
$AA^T = I$,

$$(A^{-1})^T A^{-1} = (A^T)^{-1}A^{-1} = (AA^T)^{-1} = I.$$

Clearly the identity matrix $I$ is orthogonal. Thus the
orthogonal matrices of order $n$ form a group relative to
multiplication.
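
The group properties just established can be verified on concrete matrices. The sketch below works in exact rational arithmetic; the two matrices `A` and `B` are hypothetical examples with rational entries, and the helper names are assumptions made for illustration.

```python
from fractions import Fraction as Fr


def transpose(M):
    return [list(col) for col in zip(*M)]


def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y))) for j in range(len(Y[0]))]
            for i in range(len(X))]


def identity(n):
    return [[Fr(int(i == j)) for j in range(n)] for i in range(n)]


def is_orthogonal(M):
    """True when M times its transpose is the identity matrix."""
    return matmul(M, transpose(M)) == identity(len(M))


A = [[Fr(1, 3), Fr(2, 3), Fr(2, 3)],
     [Fr(2, 3), Fr(1, 3), Fr(-2, 3)],
     [Fr(2, 3), Fr(-2, 3), Fr(1, 3)]]
B = [[Fr(3, 5), Fr(4, 5), Fr(0)],
     [Fr(-4, 5), Fr(3, 5), Fr(0)],
     [Fr(0), Fr(0), Fr(1)]]

print(is_orthogonal(A), is_orthogonal(B))   # True True
print(is_orthogonal(matmul(A, B)))          # True: the product is orthogonal
print(is_orthogonal(transpose(A)))          # True: the transpose is the inverse
```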

58. Orthogonal bases. Two vectors

$$\phi = (v_1, v_2, \dots, v_n), \qquad \omega = (w_1, w_2, \dots, w_n)$$

are defined to be orthogonal if their inner product

$$\phi \cdot \omega = v_1w_1 + v_2w_2 + \cdots + v_nw_n$$

is equal to zero. (See §§6, 7.) This definition is purely
formal, but it includes an important special case. If $S$ is
an $n$-dimensional euclidean space, and if the components
of the vectors are relative to $n$ mutually perpendicular
vectors $\epsilon_1, \epsilon_2, \dots, \epsilon_n$, each of unit length, then
$\phi \cdot \omega = 0$ means that $\phi$ and $\omega$ are perpendicular to each
other.

Let us investigate the possibility that a vector $\phi$ be
orthogonal to itself. This would entail

$$v_1^2 + v_2^2 + \cdots + v_n^2 = 0$$

and is a phenomenon which can easily occur in a field
such as the complex field where the vector $(1, i)$, for instance,
is orthogonal to itself.* In order to avoid this
embarrassing situation, we must assume that $F$ is a
formally real field, that is, that every relation

$$v_1^2 + v_2^2 + \cdots + v_n^2 = 0$$

where the $v$'s are in the field implies that $v_1 = v_2 = \cdots = v_n = 0$.
If $F$ is formally real, the only vector which is
orthogonal to itself is the zero vector.

* In complex geometry the orthogonality condition is usually
modified to be $\phi \cdot \bar{\omega} = 0$ so that only the zero vector is orthogonal to
itself.

Theorem 80. Let $F$ be formally real, and let $\rho_1,
\rho_2, \dots, \rho_k$ be any $k$ vectors which span the space $S_k$ of
dimension $k$. For any value of $i$, $1 < i \le k$, it is possible to
find a vector $\rho_i'$ such that

$$\rho_1, \dots, \rho_{i-1}, \rho_i', \rho_{i+1}, \dots, \rho_k$$

span $S_k$, and such that

$$\rho_i' \cdot \rho_j = 0, \qquad j = 1, 2, \dots, i - 1.$$

Let us set

$$\rho_i' = \sum_{h=1}^{i} l_h\rho_h$$

where the coefficients $l_1, \dots, l_i$ are to be determined
so that

$$\rho_i' \cdot \rho_j = \sum_{h=1}^{i} l_h\,\rho_h \cdot \rho_j = 0, \qquad j = 1, 2, \dots, i - 1.$$

The inner product $\rho_h \cdot \rho_j$ is in $F$ so that the above condition
is actually a system of $i - 1$ linear homogeneous
equations in the $i$ unknowns $l_1, l_2, \dots, l_i$, which by
Corollary 4 has a solution not composed entirely of 0's.
In order to show that the linear systems of vectors

$$\rho_1, \dots, \rho_i, \dots, \rho_k; \qquad \rho_1, \dots, \rho_i', \dots, \rho_k$$

are linearly equivalent, it is sufficient to show that $\rho_i$ can
be written as a linear combination of the vectors

$$\rho_1, \dots, \rho_{i-1}, \rho_i'.$$

That is, it is sufficient to show that $l_i \ne 0$. But if $l_i$ were
zero, we should have

$$\rho_i' = l_1\rho_1 + l_2\rho_2 + \cdots + l_{i-1}\rho_{i-1}.$$

Since $\rho_i'$ is orthogonal to each of these vectors, it is
orthogonal to every linear combination of them, and
hence to itself. It must therefore be the zero vector, since
$F$ is formally real, and hence

$$l_1\rho_1 + l_2\rho_2 + \cdots + l_{i-1}\rho_{i-1} = 0.$$

But the vectors $\rho_1, \rho_2, \dots, \rho_k$ are linearly independent
since they span $S_k$ of dimension $k$, so that we should
have

$$l_1 = l_2 = \cdots = l_{i-1} = l_i = 0,$$

which is a contradiction. Thus $l_i \ne 0$.

A vector space $S_k$ of dimension $k$ is said to possess an
orthogonal basis

$$\rho_1, \rho_2, \dots, \rho_k$$

if the $\rho$'s span $S_k$, and if

$$\rho_i \cdot \rho_j = 0, \qquad i \ne j.$$

Theorem 81. If $F$ is formally real, and $S_k$ of
dimension $k$ is a subspace of the total vector space $S$ of
dimension $n$, an orthogonal basis

$$\rho_1, \rho_2, \dots, \rho_k, \dots, \rho_n$$

can be found for $S$ such that

$$\rho_1, \rho_2, \dots, \rho_k$$

is an orthogonal basis for $S_k$.

Let $\gamma_1, \gamma_2, \dots, \gamma_n$ be a basis for $S$ such that $\gamma_1,
\gamma_2, \dots, \gamma_k$ is a basis for $S_k$. Take $\rho_1' = \gamma_1$. Then by
means of Theorem 80 replace $\gamma_2$ by $\rho_2'$ such that
$\rho_1' \cdot \rho_2' = 0$. Then replace $\gamma_3$ by $\rho_3'$ such that

$$\rho_1' \cdot \rho_3' = \rho_2' \cdot \rho_3' = 0.$$

Continue in this way until

$$\rho_1', \rho_2', \dots, \rho_k'$$

is an orthogonal basis for $S_k$. Then continue in the same
manner until an orthogonal basis

$$\rho_1', \rho_2', \dots, \rho_n'$$

for $S$ is obtained.
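
The successive replacements in this proof can be carried out explicitly. The sketch below uses the familiar Gram-Schmidt process, which is one particular way of choosing the coefficients of Theorem 80 (it takes $l_i = 1$); it is not claimed to be the construction the text has in mind. The rational field stands in for a formally real field, and the starting basis `gammas` is a hypothetical example.

```python
from fractions import Fraction


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def orthogonalize(basis):
    """Gram-Schmidt without normalization.

    Each vector has its component along every earlier output vector
    removed in turn, so the outputs are mutually orthogonal and the first
    i outputs span the same space as the first i inputs.
    """
    out = []
    for g in basis:
        v = [Fraction(x) for x in g]
        for r in out:
            c = dot(v, r) / dot(r, r)   # dot(r, r) != 0 in a formally real field
            v = [a - c * b for a, b in zip(v, r)]
        out.append(v)
    return out


# Hypothetical basis of 3-space whose first two vectors span a subspace S_2.
gammas = [[1, 1, 0], [1, 0, 1], [0, 0, 1]]
rhos = orthogonalize(gammas)
for i in range(3):
    for j in range(i):
        assert dot(rhos[i], rhos[j]) == 0
print(rhos[1] == [Fraction(1, 2), Fraction(-1, 2), Fraction(1)])   # True
```

No square roots are needed at this stage, which is why the construction works over any formally real field; normalization is a separate step.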

A vector $\phi = (v_1, v_2, \dots, v_n)$ is said to be normalized if

$$v_1^2 + v_2^2 + \cdots + v_n^2 = 1.$$

Every vector $\phi \ne 0$ possesses a normalizing factor $k$, such
that $k\phi$ is normalized, if $\sqrt{v_1^2 + v_2^2 + \cdots + v_n^2}$ is in the
field $F$ and is not 0. That $F$ be formally real is necessary
but not sufficient. Instead of attempting to delineate the
most general type of field in which this condition is
satisfied, we shall in the remainder of this chapter assume
that $F$ is the real field $R$.

Corollary 81. If $S_k$ of dimension $k$ is a subspace of
the total vector space $S$ of dimension $n$ over the real field, a
normalized orthogonal basis

$$\rho_1, \rho_2, \dots, \rho_k, \dots, \rho_n$$

can be found for $S$ such that

$$\rho_1, \rho_2, \dots, \rho_k$$

is a normalized orthogonal basis for $S_k$.

For if $\rho_1, \rho_2, \dots, \rho_n$ is an orthogonal basis, $k_1\rho_1,
k_2\rho_2, \dots, k_n\rho_n$ is a normalized orthogonal basis where

$$k_i = 1/\sqrt{b_{i1}^2 + b_{i2}^2 + \cdots + b_{in}^2}, \qquad \rho_i = (b_{i1}, b_{i2}, \dots, b_{in}).$$

We now see that the row vectors (column vectors) of
an orthogonal matrix form a set of normalized orthogonal
vectors.

59. Symmetric matrices. A matrix is said to be symmetric
if it is equal to its transpose. Thus $A = (a_{rs})$ is
symmetric if $a_{rs} = a_{sr}$, e.g.,

$$\begin{bmatrix} 2 & 3 & -1 \\ 3 & 3 & 1 \\ -1 & 1 & -2 \end{bmatrix}.$$

If $A = A^T$, then $A^i = (A^T)^i = (A^i)^T$ so that $A^i$ is also
symmetric. Clearly the identity matrix $I$ is symmetric,
and $kA$ is symmetric for every number $k$ of the field. If
$A$ and $B$ are symmetric,

$$(A + B)^T = A^T + B^T = A + B,$$

so that the sum of two symmetric matrices is symmetric.
Thus if $A$ is symmetric, so is $f(A)$ where $f(x)$ is any polynomial.
In particular the inverse of a non-singular symmetric
matrix, being a polynomial in that matrix, is symmetric.

Two matrices $A$ and $B$ are said to be orthogonally
similar if there exists an orthogonal matrix $P$ such that

$$A = P^TBP.$$

We may then write A=B. It may readily be verified 
that orthogonal similarity is reflexive, symmetric and 
transitive. (See §41.) If $B$ is symmetric and $A$ is orthogonally
similar to $B$, then $A$ is also symmetric. For if $A = P^TBP$, then

$$A^T = P^TB^TP = P^TBP = A.$$

The theory of similarity is seriously handicapped
when the transforming matrix is restricted to be orthogonal.
To obtain any worth-while results, the transformed
matrix must also be restricted. When the latter
is taken to be symmetric, results of great geometrical
importance can be obtained.

Theorem 82. Let $A$ be a symmetric* matrix with elements
in any field $F$, such that the minimum function possesses
two distinct linear factors $l_1(x)$ and $l_2(x)$ with coefficients
in $F$. Let $\phi_1$ be a vector of the null space of $l_1(A)$,
and let $\phi_2$ be a vector of the null space of $l_2(A)$. Then $\phi_1$
and $\phi_2$ are orthogonal.

Let

$$l_1(x) = x_1 - x, \qquad l_2(x) = x_2 - x, \qquad x_1 \ne x_2.$$

By Theorem 44, $l_1(A)$ is singular so that there exists a
non-zero vector $\phi_1$ such that

$$l_1(A)\phi_1 = 0, \qquad A\phi_1 = x_1\phi_1.$$

Similarly there exists a non-zero vector $\phi_2$ such that
$A\phi_2 = x_2\phi_2$. Hence

$$\phi_2 \cdot A\phi_1 = \phi_2 \cdot x_1\phi_1 = x_1\,\phi_2 \cdot \phi_1.$$

If $A = A^T$,

$$\phi_2 \cdot A\phi_1 = (A^T\phi_2) \cdot \phi_1 = x_2\,\phi_2 \cdot \phi_1.$$

That is,

$$(x_1 - x_2)\,\phi_2 \cdot \phi_1 = 0.$$

If $x_1 \ne x_2$, $\phi_2 \cdot \phi_1 = 0$ so that $\phi_1$ and $\phi_2$ are orthogonal
vectors.

A vector $\phi_1$ such that $A\phi_1 = x_1\phi_1$ where $x_1$ is a characteristic
root of $A$ is commonly called a pole or characteristic
vector of $A$.
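
A small numerical check may make the role of symmetry in Theorem 82 concrete. In the sketch below the matrices `S` and `N` and their characteristic vectors are hypothetical examples chosen so that everything is an integer; `S` is symmetric and `N` is not.

```python
def apply(M, v):
    """Matrix times column vector."""
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


S = [[2, 1], [1, 2]]          # symmetric, characteristic roots 3 and 1
s1, s2 = [1, 1], [1, -1]      # S s1 = 3 s1 and S s2 = 1 s2
N = [[2, 1], [0, 3]]          # not symmetric, characteristic roots 2 and 3
n1, n2 = [1, 0], [1, 1]       # N n1 = 2 n1 and N n2 = 3 n2

assert apply(S, s1) == [3 * x for x in s1] and apply(S, s2) == s2
assert apply(N, n1) == [2 * x for x in n1] and apply(N, n2) == [3 * x for x in n2]

print(dot(s1, s2))   # 0: poles of the symmetric matrix are orthogonal
print(dot(n1, n2))   # 1: without symmetry the conclusion can fail
```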

Corollary 82. The characteristic roots of a real symmetric
matrix are all real.

If $A$ is real, its characteristic equation has real coefficients.
The minimum function is a divisor of the characteristic
function in the domain $R[x]$, so that the
minimum equation of $A$ also has real coefficients. The
non-real roots of such an equation occur in conjugate
pairs.

Suppose that $x_1$ is a non-real characteristic root of
the real symmetric matrix $A$, and let $\phi_1$ be a non-zero
vector of the null space of $x_1I - A$. Then

$$A\phi_1 = x_1\phi_1$$

and, since $A$ is real,

$$A\bar{\phi}_1 = \bar{x}_1\bar{\phi}_1$$

where $\bar{x}_1$ is the complex conjugate of $x_1$. That is, $\bar{\phi}_1$ is a
vector of the null space of $\bar{x}_1I - A$. Since $x_1$ is not real,
$x_1 \ne \bar{x}_1$, so that

$$\phi_1 \cdot \bar{\phi}_1 = 0$$

by Theorem 82. If $\phi_1 = (v_1, v_2, \dots, v_n)$, then

$$\phi_1 \cdot \bar{\phi}_1 = v_1\bar{v}_1 + v_2\bar{v}_2 + \cdots + v_n\bar{v}_n = 0.$$

Each term $v_i\bar{v}_i$ is a non-negative real number, so this implies that

$$v_1 = v_2 = \cdots = v_n = 0,$$

which is a contradiction, since we assumed $\phi_1 \ne 0$. Thus
no non-real characteristic root of $A$ can exist.

Theorem 83. The roots of the minimum equation of a
real symmetric matrix are distinct.

Suppose that the real symmetric matrix $A$ has the
minimum function

$$m(x) = (x_1 - x)^k h(x), \qquad k > 1.$$

Define $m_1(x)$ as follows:

$$m_1(x) = (x_1 - x)^{k-1} h(x).$$

Then $m(x)$ is a divisor of $[m_1(x)]^2$ so that

$$[m_1(A)]^2 = 0.$$

The matrix

$$m_1(A) = A_1 = (a_{rs})$$

is symmetric, since it is a polynomial in a symmetric
matrix. Also $A_1^2 = 0$. Hence

$$\sum_{i=1}^{n} a_{ri}a_{is} = \sum_{i=1}^{n} a_{ri}a_{si} = 0$$

and in particular

$$\sum_{i=1}^{n} a_{ri}^2 = 0, \qquad r = 1, 2, \dots, n.$$

Since the field is real, every element $a_{ri} = 0$ so that
$A_1 = 0$. That is, $m_1(A) = 0$. But $m_1(x)$ is of lower degree
than the minimum function $m(x)$, so that the hypothesis
$k > 1$ is untenable, and $x_1$ is a simple root of $m(x) = 0$.

Corollary 83. All the elementary divisors of a real
symmetric matrix are linear.

This is immediate, since every elementary divisor
divides $m(x)$.

60. The orthogonal canonical form.

Theorem 84. Let $A$ be a real symmetric matrix.
There exists an orthogonal matrix $P$ such that

$$P^TAP = \begin{bmatrix} x_1 & & & & & \\ & \ddots & & & & \\ & & x_1 & & & \\ & & & x_2 & & \\ & & & & \ddots & \\ & & & & & x_k \end{bmatrix}$$

where the $x$'s are the characteristic roots of $A$.

By Theorem 83 and Corollary 82, the minimum
function of $A$ has the form

$$m(x) = (x_1 - x)(x_2 - x) \cdots (x_k - x)$$

where $x_1, x_2, \dots, x_k$ are real and distinct. Let $S_i'$, the
null space of $x_iI - A$, be of dimension $r_i$. By Corollary
81 we can determine $r_i$ linearly independent mutually
orthogonal normalized vectors

$$\sigma_{i1}, \sigma_{i2}, \dots, \sigma_{ir_i}$$

which span $S_i'$.

Form the matrix

$$P = [\sigma_{11}, \dots, \sigma_{1r_1}, \sigma_{21}, \dots, \sigma_{2r_2}, \dots, \sigma_{k1}, \dots, \sigma_{kr_k}].$$

By Theorem 55 the column vectors of $P$ span the total
space $S$, so that $P$ is non-singular. By Theorem 82 each
vector of the space $S_i'$ is orthogonal to every vector of
$S_j'$ for $j \ne i$. Consequently $P$ is an orthogonal matrix.

Since every vector of $S_i'$ is annihilated by $x_iI - A$, it
follows that

$$A\sigma_{11} = \sigma_{11}x_1, \ \dots, \ A\sigma_{1r_1} = \sigma_{1r_1}x_1, \ \dots, \ A\sigma_{kr_k} = \sigma_{kr_k}x_k.$$

These equations can all be combined into the one matric
equation

$$AP = PB, \qquad P^TAP = B$$

where

$$B = \begin{bmatrix} x_1 & & \\ & \ddots & \\ & & x_k \end{bmatrix}$$

where $x_i$ is repeated $r_i$ times.

Since $A$ and $B$ are similar, they have the same characteristic
roots, which are obviously the diagonal elements
of $B$.

It should be noted that if the characteristic roots of 
A are all distinct, and if their order is specified, the 
orthogonal matrix P is unique except that the signs of 
the elements of any column may be reversed. In other 
cases it is not unique. Later we shall see the geometric 
consequences of this fact. 

61. Principal axis transformation. Let $F$ be any field.
The polynomial

$$f(x_1, x_2, \dots, x_n) = \sum_{i,j=1}^{n} a_{ij}x_ix_j$$

with coefficients in $F$ is known as a quadratic form of
matrix $A = (a_{rs})$. It is obviously no restriction to assume
that $a_{ij} = a_{ji}$, so that every quadratic form has a symmetric
matrix.

Let us introduce new variables by means of the linear
homogeneous transformation

$$x_i = \sum_{j=1}^{n} p_{ij}x_j', \qquad i = 1, 2, \dots, n,$$

and call $P = (p_{rs})$ the matrix of the transformation.
Clearly $f$ is transformed into a new quadratic form

$$f = \sum_{i,j=1}^{n} b_{ij}x_i'x_j'.$$

That is,

$$f = \sum a_{ij}p_{ik}p_{jl}x_k'x_l'.$$

Upon comparing coefficients we see that

$$b_{rs} = \sum a_{ij}p_{ir}p_{js},$$

which in matric notation is

$$B = P^TAP.$$

Theorem 85. If $f(x_1, x_2, \dots, x_n)$ is a real quadratic
form, we can determine an orthogonal transformation on the
$x$'s which reduces $f$ to the form

$$k_1x_1'^2 + k_2x_2'^2 + \cdots + k_nx_n'^2$$

where $k_1, k_2, \dots, k_n$ are the characteristic roots of the
(symmetric) matrix of $f$.

This follows directly from Theorem 84.

Let us apply this theorem to a problem in analytic
geometry. In ordinary euclidean 3-space the equation

$$2x^2 + 2y^2 - 4z^2 - 2yz - 2xz - 5xy - 2x - 2y + z = 0$$

represents a quadric surface. We wish to find an orthogonal
transformation which will eliminate the terms in
$xy$, $xz$ and $yz$. Since the second degree terms transform
into second degree terms, and the linear terms into linear
terms, we may for the moment ignore the latter. The
form

$$f(x, y, z) = 2x^2 + 2y^2 - 4z^2 - 2yz - 2xz - 5xy$$

has the symmetric matrix

$$A = \begin{bmatrix} 2 & -\tfrac{5}{2} & -1 \\ -\tfrac{5}{2} & 2 & -1 \\ -1 & -1 & -4 \end{bmatrix}$$

whose characteristic equation is

$$x^3 - \tfrac{81}{4}x = 0.$$

Thus the characteristic roots are

$$k_1 = \tfrac{9}{2}, \qquad k_2 = -\tfrac{9}{2}, \qquad k_3 = 0.$$

The null space $S_1'$ of $\tfrac{9}{2}I - A$ is spanned by the
normalized vector

$$\phi_1 = \left(\frac{1}{\sqrt{2}},\ -\frac{1}{\sqrt{2}},\ 0\right),$$

and similarly the null space $S_2'$ of $-\tfrac{9}{2}I - A$ is spanned
by

$$\phi_2 = \left(\frac{1}{3\sqrt{2}},\ \frac{1}{3\sqrt{2}},\ \frac{4}{3\sqrt{2}}\right)$$

and the null space $S_3'$ of $-A$ is spanned by

$$\phi_3 = \left(\tfrac{2}{3},\ \tfrac{2}{3},\ -\tfrac{1}{3}\right).$$

The matrix

$$P = \begin{bmatrix}
\dfrac{1}{\sqrt{2}} & \dfrac{1}{3\sqrt{2}} & \dfrac{2}{3} \\
-\dfrac{1}{\sqrt{2}} & \dfrac{1}{3\sqrt{2}} & \dfrac{2}{3} \\
0 & \dfrac{4}{3\sqrt{2}} & -\dfrac{1}{3}
\end{bmatrix}$$

is orthogonal, and by means of the orthogonal transformation
having $P$ as matrix, $f$ is reduced to the form

$$\tfrac{9}{2}x'^2 - \tfrac{9}{2}y'^2$$

and the equation of the quadric is reduced to a form in
which the cross-product terms are absent.

If two of the characteristic roots had proved to be 
equal, the orthogonal matrix would not have been 
uniquely determined. This would mean that the quadric 
was a surface of revolution. If all three roots are equal, 
it is a sphere. 
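
The reduction of the quadric can be checked in exact arithmetic. The sketch below assumes that the matrix of the form is the symmetric matrix with diagonal 2, 2, -4 and off-diagonal entries equal to half the cross-product coefficients, and it uses unnormalized characteristic vectors (1, -1, 0), (1, 1, 4), (2, 2, -1) so that no square roots are needed; the helper names are illustrative.

```python
from fractions import Fraction as Fr

# Matrix of f = 2x^2 + 2y^2 - 4z^2 - 2yz - 2xz - 5xy (assumed symmetric form).
A = [[Fr(2), Fr(-5, 2), Fr(-1)],
     [Fr(-5, 2), Fr(2), Fr(-1)],
     [Fr(-1), Fr(-1), Fr(-4)]]

# Unnormalized characteristic vectors paired with their characteristic roots.
pairs = [([1, -1, 0], Fr(9, 2)), ([1, 1, 4], Fr(-9, 2)), ([2, 2, -1], Fr(0))]


def apply(M, v):
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def form(v):
    """Value of the quadratic form at v, computed as v . (A v)."""
    return dot(v, apply(A, v))


def reduced(c):
    """k1 x'^2 + k2 y'^2 + k3 z'^2 for the point c1 v1 + c2 v2 + c3 v3,
    whose coordinates on the normalized vectors are c_i * |v_i|."""
    return sum(k * ci * ci * dot(v, v) for ci, (v, k) in zip(c, pairs))


for v, k in pairs:
    assert apply(A, v) == [k * x for x in v]          # A v = k v
assert all(dot(pairs[i][0], pairs[j][0]) == 0 for i in range(3) for j in range(i))

c = [1, 2, 3]
point = [sum(ci * v[j] for ci, (v, _) in zip(c, pairs)) for j in range(3)]
assert form(point) == reduced(c) == -315
```

The last assertion compares the form evaluated in the old coordinates with the sum-of-squares expression in the new ones at a sample point.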



CHAPTER IX 
ENDOMORPHISMS 

62. Groups with operators. In this concluding chap- 
ter we shall treat vectors and matrices from a more 
abstract point of view and attempt to give the reader 
an insight into what is at the moment the popular mode 
of approach to matric theory. 

The simplest important mathematical system is the
group. A group consists of elements and a well-defined
binary operation with respect to which the system is
closed. That is, every ordered pair $\alpha$, $\beta$ of equal or distinct
elements of the system determines a unique element
$\gamma$ of the system. We shall choose to denote the symbol
of the operation by + and to write

$$\alpha + \beta = \gamma.$$

In order that the system shall be a group it is necessary
that the group operation be associative:

$$(\alpha + \beta) + \gamma = \alpha + (\beta + \gamma).$$

Thirdly, there must exist an identity element 0 such that

$$\alpha + 0 = 0 + \alpha = \alpha$$

for every group element $\alpha$. And lastly, corresponding to
each element $\alpha$ there must be an inverse $-\alpha$ such that

$$-\alpha + \alpha = \alpha + (-\alpha) = 0.$$

If the group has the further property that for every
pair of elements $\alpha$ and $\beta$

$$\alpha + \beta = \beta + \alpha,$$

then the group is said to be commutative or abelian.

In every commutative group $G$ one may set up correspondences
such as

$$\alpha \to \alpha', \ \dots \tag{42}$$

by which every element of $G$ determines a unique element
of $G$. These primed elements may or may not constitute
the whole of $G$. Such a correspondence will be
called an endomorphism of $G$ if for every two elements
$\alpha$ and $\beta$ of $G$ it is true that

$$\alpha + \beta \to \alpha' + \beta'.$$

We may think of every endomorphism as being accomplished
by an operator. We define $k$ to be the operator
such that

$$k\alpha = \alpha', \quad k\beta = \beta', \quad \dots.$$

Since the correspondence is an endomorphism, $k$ is a
distributive operator:

$$k(\alpha + \beta) = \alpha' + \beta' = k\alpha + k\beta.$$

Every commutative group has at least two operators,
the zero operator 0 such that*

$$0\alpha = 0$$

for every element $\alpha$ of $G$, and the unit operator 1 such
that

$$1\alpha = \alpha$$

for every element $\alpha$ of $G$. For every operator $k$, $k0 = 0$,
for

$$k\alpha = k(\alpha + 0) = k\alpha + k0.$$

* The context will indicate whether 0 is a group element or an
operator and will make unnecessary the use of more elaborate notation.

Let $k$ and $l$ be two equal or distinct operators, and
consider the correspondence

$$\alpha \to k\alpha + l\alpha.$$

This correspondence is an endomorphism, for

$$\alpha + \beta \to k(\alpha + \beta) + l(\alpha + \beta) = (k\alpha + k\beta) + (l\alpha + l\beta) = (k\alpha + l\alpha) + (k\beta + l\beta).$$

This endomorphism we shall call the sum of the endomorphisms
$k$ and $l$, and denote by the symbol $k + l$.
That is,

$$(k + l)\alpha = k\alpha + l\alpha.$$

If 0 is the zero operator,

$$(k + 0)\alpha = k\alpha + 0\alpha = k\alpha$$

for every element $\alpha$ of $G$. Since $k + 0$ and $k$ have the same
effect upon every $\alpha$, these operators are equal. Thus 0
is an identity element in the addition of operators (or
endomorphisms).

Since $l\alpha, l\beta, \dots$ are elements of $G$, so are

$$k(l\alpha), \quad k(l\beta), \quad \dots$$

for all operators $k$ and $l$. The correspondence

$$\alpha \to k(l\alpha), \quad \beta \to k(l\beta), \quad \dots$$

is an endomorphism, for

$$\alpha + \beta \to k(l(\alpha + \beta)) = k(l\alpha + l\beta) = k(l\alpha) + k(l\beta).$$

This endomorphism we shall denote by the symbol $kl$,
and call the product of the endomorphisms $k$ and $l$. That
is,

$$(kl)\alpha = k(l\alpha).$$

If 1 is the unit operator,

$$(k1)\alpha = k(1\alpha) = k\alpha, \qquad (1k)\alpha = 1(k\alpha) = k\alpha$$

for every group element $\alpha$. Since $k1$ and $1k$ produce the
same effect as $k$ on every element $\alpha$, it is true that

$$k1 = 1k = k.$$

Thus 1 is the unit element in the multiplication of endomorphisms.
Let us now consider the set $\Omega$ of all endomorphisms
of a commutative group $G$. These endomorphisms, or
operators, constitute the operator domain of $G$. We shall
prove

Theorem 86. The operator domain $\Omega$ of a commutative
group is a ring with unit element.

To be a ring with unit element, the domain $\Omega$ must
have the following properties:

1. The domain is a commutative group with respect
to addition.

2. The domain is closed with respect to multiplication.
Multiplication is associative, and there is a unit
element.

3. Multiplication is distributive with respect to addition.

To establish the first property, we note that the sum
of two operators is an operator, and that addition of
operators is associative, since

$$\begin{aligned}
[k + (l + m)]\alpha &= k\alpha + (l + m)\alpha = k\alpha + (l\alpha + m\alpha) \\
&= (k\alpha + l\alpha) + m\alpha = (k + l)\alpha + m\alpha \\
&= [(k + l) + m]\alpha
\end{aligned}$$

for every element $\alpha$ of $G$. The unit of addition, as we
have seen, is 0. For every element $\alpha$ of $G$,

$$k\alpha + k(-\alpha) = k(\alpha + (-\alpha)) = k0 = 0$$

so that $k(-\alpha)$ is an inverse $-k\alpha$ of $k\alpha$. Hence the operator
$\alpha \to k(-\alpha)$ is an inverse of $k$ with respect to addition. Finally,
addition of operators is commutative if $G$ is commutative.

We have seen that $\Omega$ is closed with respect to multiplication,
and that 1 is the unit operator. Multiplication
is associative, since

$$[k(lm)]\alpha = k[(lm)\alpha] = k[l(m\alpha)] = (kl)(m\alpha) = [(kl)m]\alpha$$

for every element $\alpha$ of $G$.

To prove the first distributive law, we note that

$$\begin{aligned}
[k(l + m)]\alpha &= k[(l + m)\alpha] = k[l\alpha + m\alpha] \\
&= k(l\alpha) + k(m\alpha) = (kl)\alpha + (km)\alpha \\
&= [kl + km]\alpha
\end{aligned}$$

for every element $\alpha$ of $G$. That is,

$$k(l + m) = kl + km.$$

Similarly we may show that

$$(k + l)m = km + lm.$$
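
Theorem 86 can be checked by brute force on a small group. The sketch below takes as a hypothetical example the group of pairs of integers modulo 2, enumerates every map of the group into itself, keeps the endomorphisms, and verifies the ring laws; all names are illustrative.

```python
from itertools import product

# Hypothetical example group: pairs of integers mod 2 under componentwise addition.
G = list(product(range(2), repeat=2))


def add(a, b):
    return tuple((x + y) % 2 for x, y in zip(a, b))


def is_endomorphism(f):
    """f is a dict from G to G; check f(a + b) = f(a) + f(b) for all a, b."""
    return all(f[add(a, b)] == add(f[a], f[b]) for a in G for b in G)


# Enumerate every map G -> G and keep the endomorphisms.
maps = (dict(zip(G, images)) for images in product(G, repeat=len(G)))
omega = [f for f in maps if is_endomorphism(f)]


def op_sum(k, l):
    return {a: add(k[a], l[a]) for a in G}      # (k + l)a = ka + la


def op_prod(k, l):
    return {a: k[l[a]] for a in G}              # (kl)a = k(la)


zero = {a: (0, 0) for a in G}
one = {a: a for a in G}

closed = all(op_sum(k, l) in omega and op_prod(k, l) in omega
             for k in omega for l in omega)
left_dist = all(op_prod(k, op_sum(l, m)) == op_sum(op_prod(k, l), op_prod(k, m))
                for k in omega for l in omega for m in omega)
right_dist = all(op_prod(op_sum(k, l), m) == op_sum(op_prod(k, m), op_prod(l, m))
                 for k in omega for l in omega for m in omega)
commutative_mult = all(op_prod(k, l) == op_prod(l, k) for k in omega for l in omega)
print(len(omega), closed, left_dist, right_dist, commutative_mult)
```

For this group the operator domain has 16 elements and its multiplication is not commutative, so it is a ring with unit element but not a field.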

63. Vector fields. A ring with unit element falls short
of being a field unless multiplication is commutative and
every element except the identity 0 of addition has an
inverse with respect to multiplication.

As we have seen, the total operator domain $\Omega$ of a
commutative group $G$ is a ring with unit element. If $\Omega$
contains a proper subring $\Omega_1$ with unit element, then $\Omega_1$
is an operator domain of $G$ properly contained in the
total operator domain.

A commutative group with an operator domain $\Omega_1$
(whether the total operator domain or not) which is a
field is called a vector space.

If $\Omega_1$ is a field, every operator $k \ne 0$ has an inverse
$k^{-1}$ such that $k^{-1}k = kk^{-1}$ is the unit operator 1. If, then,
$k\alpha = 0$ with $k \ne 0$,

$$\alpha = 1\alpha = (k^{-1}k)\alpha = k^{-1}(k\alpha) = k^{-1}0 = 0.$$

The contrapositive of this statement is that, if $k\alpha = 0$
and $\alpha \ne 0$, then $k = 0$.

The significance of the assumption that $F$ is a field
may now be seen upon referring back to (42) where it
was noted that the primed elements might not constitute
the whole of $G$.

Theorem 87. If $k$ is any operator of $\Omega_1$, not the 0-operator,
then

$$k\alpha, \quad k\beta, \quad k\gamma, \quad \dots$$

are all distinct and constitute the whole of $G$. That is, $k$
merely brings about a permutation of the elements of $G$.

Let $G$ be the group of elements $\alpha, \beta, \gamma, \dots$, and let
$G'$ be that subset of $G$ which is composed of the elements
$k\alpha, k\beta, k\gamma, \dots$. Since $k \ne 0$, there is an operator $k^{-1}$ of
$\Omega_1$ such that $kk^{-1} = 1$. Let $\eta$ be any element of $G$. Then
$k^{-1}\eta$ is an element of $G$ such that

$$k(k^{-1}\eta) = (kk^{-1})\eta = 1\eta = \eta,$$

so that $\eta$ is in $G'$.

If $k \ne 0$, the elements $k\alpha, k\beta, k\gamma, \dots$ are all distinct,
for $k\alpha = k\beta$ would imply

$$k\alpha - k\beta = k(\alpha - \beta) = 0$$

and consequently $\alpha - \beta = 0$.
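
The content of Theorem 87 is easy to observe on a finite example. The sketch below takes, as hypothetical choices, the field of integers modulo 5 acting on pairs of such integers, and contrasts it with the integers modulo 6, which form a ring but not a field.

```python
from itertools import product

p = 5                                    # hypothetical choice: integers mod 5 form a field
G = list(product(range(p), repeat=2))    # the group of pairs under addition mod p


def act(k, v):
    """The scalar operator k applied to the group element v."""
    return tuple((k * x) % p for x in v)


for k in range(1, p):
    images = [act(k, v) for v in G]
    # The images are distinct and exhaust G, so k permutes the elements of G.
    assert sorted(images) == sorted(G)

# Contrast: modulo 6 the operator 2 is neither one-to-one nor onto.
images_mod6 = {(2 * x) % 6 for x in range(6)}
print(sorted(images_mod6))               # [0, 2, 4]
```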

We shall call a vector space $S_n$ a finite vector space of
dimension $n$ if it contains $n$ linearly independent elements
while every set of $n + 1$ elements is linearly dependent.

Let $S_n$ be a finite vector space of dimension $n$, and
let

$$\epsilon_1, \epsilon_2, \dots, \epsilon_n$$

be $n$ linearly independent elements of $S_n$. If $\alpha$ is any
element of $S_n$, there exist $n + 1$ numbers $b_1, b_2, \dots, b_{n+1}$
of $F$ not all 0 such that

$$b_1\epsilon_1 + b_2\epsilon_2 + \cdots + b_n\epsilon_n + b_{n+1}\alpha = 0.$$

If $b_{n+1}$ were 0, at least one of the other $b$'s would be
different from 0, and this would imply a linear dependence
relation among the $\epsilon$'s, which is not possible. Thus
$b_{n+1} \ne 0$ and we may write

$$\alpha = a_1\epsilon_1 + a_2\epsilon_2 + \cdots + a_n\epsilon_n, \qquad a_i = -b_i/b_{n+1},$$

or

$$\alpha = (a_1, a_2, \dots, a_n)\begin{bmatrix} \epsilon_1 \\ \epsilon_2 \\ \vdots \\ \epsilon_n \end{bmatrix}.$$

This representation of $\alpha$ is unique, for if $\alpha$ had two such
representations in terms of the $\epsilon$'s, their difference
would yield a linear relation among the $\epsilon$'s.

Consider the correspondence

$$\alpha \leftrightarrow (a_1, a_2, \dots, a_n)$$

where $(a_1, a_2, \dots, a_n)$ is an $n$-tuple of numbers of the
field $F$. This correspondence is biunique. Let addition
and scalar multiplication of $n$-tuples be defined as in §7.
Then, if

$$\beta = b_1\epsilon_1 + b_2\epsilon_2 + \cdots + b_n\epsilon_n, \qquad \beta \leftrightarrow (b_1, b_2, \dots, b_n)$$

is another element of the vector space,

$$\alpha + \beta \leftrightarrow (a_1 + b_1, a_2 + b_2, \dots, a_n + b_n) = (a_1, a_2, \dots, a_n) + (b_1, b_2, \dots, b_n)$$

and

$$k\alpha \leftrightarrow (ka_1, ka_2, \dots, ka_n) = k(a_1, a_2, \dots, a_n).$$

Consequently the correspondence is an isomorphism, and
we have

Theorem 88. Every finite vector space of dimension $n$
is isomorphic with a vector space of $n$-tuples of numbers of
$F$ as defined in §7.

The representation of the vector space in terms of
$n$-tuples of numbers of $F$ is dependent upon the $n$ linearly
independent elements $\epsilon_1, \epsilon_2, \dots, \epsilon_n$ which were
chosen for the basis. It is clear from the theory of linear
dependence that every basis is composed of the same
number $n$ of elements so that the concept of dimension is
well defined.

Two isomorphic mathematical systems are abstractly
identical as far as their properties with respect
to the operations under consideration are concerned.

It is worthy of mention that for $n = 1$ this vector
space becomes the field $F$.

There are vector spaces other than those composed
of $n$-tuples of numbers of a field. For instance, consider
the set of all polynomials in $x$ with coefficients in an
infinite field $F$. We may think of each polynomial as an
$n$-tuple, where $n$ is infinite, composed of all values which
it assumes when $x$ is replaced by the elements of $F$, or
we may think of the polynomial as a vector merely because
the postulates for a vector space are satisfied.

64. Matrices. Consider the total operator domain $\Omega$
of the commutative group $G$, and let $F$ be a subdomain
of $\Omega$ which is a field contained in the central of $\Omega$, that is,
in the set of operators which commute with every operator of $\Omega$. If $a$
is any operator of $\Omega$ defining the endomorphism

$$\alpha \to \alpha', \quad \beta \to \beta', \quad \dots$$

and if $k$ is any operator of $F$, then because $F$ is in the
central of $\Omega$,

$$a(k\alpha) = k(a\alpha) = k\alpha',$$

so that

$$k\alpha \to k\alpha'.$$

We may properly call $a$ an endomorphism of the vector
space $S$ defined by $G$ and $F$.

We shall confine our attention to a finite vector
space $S_n$ of dimension $n$ composed of all $n$-tuples of numbers
of $F$. This vector space has a basis composed of $n$
linearly independent vectors

$$\epsilon_1, \epsilon_2, \dots, \epsilon_n.$$

Evidently the effect of an endomorphism on every element
of $S_n$ is known when its effect on the basic vectors
is known. Let $a$ be an endomorphism of $S_n$ under which

$$\epsilon_1 \to \epsilon_1', \quad \epsilon_2 \to \epsilon_2', \quad \dots, \quad \epsilon_n \to \epsilon_n'.$$

Since each vector «/ is uniquely expressible in the form 

€» = OtjCi + m ai2€2 + • • • + Oinfn 

where the a* s are in F, we may write 


€]' 


<Zii a \2 • • • a ln ~ 


€i 


= 

a2i 022 • • • a2n 


€2 

- €n - 


a n 2 * * * a n n„. 


- €n - 
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In matric notation this is 

£' = AE 


Hence 


aE = 


a€i 

Q € 2 


= E f = yi£. 


i4 = (O- 


L ae n J 

Thus every endomorphism a determines a unique matrix
A with elements in F. This correspondence is actually
biunique, for the $\epsilon_i'$ are uniquely determined by the
matrix A. We shall write

\[ a \leftrightarrow A. \]
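The correspondence may be made concrete by a short computation. The following is only a sketch under stated assumptions: the basis is taken to be the standard one, the endomorphism `a` is a hypothetical example on $S_2$, and the helper names `matrix_of` and `apply_rows` are ours, not the text's. Row i of the computed matrix holds the coordinates of $a\epsilon_i$, in agreement with $E' = AE$.

```python
from fractions import Fraction

def matrix_of(endo, n):
    """Return A = (a_rs): row i holds the coordinates of endo(e_i)."""
    rows = []
    for i in range(n):
        e_i = tuple(Fraction(int(i == j)) for j in range(n))  # i-th basis vector
        rows.append(tuple(endo(e_i)))
    return rows

def apply_rows(x, A):
    """Image of x = sum_i x_i e_i, computed as sum_i x_i * (row i of A)."""
    n = len(A[0])
    return tuple(sum(x[i] * A[i][j] for i in range(len(A))) for j in range(n))

def a(v):
    # Hypothetical endomorphism of S_2: (x, y) -> (x + 2y, 3y).
    x, y = v
    return (x + 2 * y, 3 * y)

A = matrix_of(a, 2)
v = (Fraction(5), Fraction(-1))
image_direct = a(v)                   # apply the endomorphism itself
image_from_matrix = apply_rows(v, A)  # use only its effect on the basis
print(image_direct == image_from_matrix)
```

The agreement of the two images illustrates that the effect of an endomorphism on every vector is known once its effect on the basic vectors is known.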


In the endomorphism ring of $S_n$ we have defined two
operations, addition and multiplication. In the matric
ring of order n over F we also have two operations denoted
by addition and multiplication. A biunique correspondence

\[ a \leftrightarrow A, \quad b \leftrightarrow B, \quad \cdots \]

is an isomorphism with respect to addition if

\[ a + b \leftrightarrow A + B, \]

and is an isomorphism with respect to multiplication if*

\[ ab \leftrightarrow BA. \]


The endomorphism ring will be isomorphic with the 
matric ring if it is isomorphic with it with respect to 
both addition and multiplication. 

* The order in which the product is written is a purely notational
matter.

Theorem 89. The endomorphism ring of the vector
space $S_n$ of dimension n over the field F is isomorphic with
the ring of all matrices of order n over F.

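Let a and b be two endomorphisms, and let

\[ a \leftrightarrow A, \qquad b \leftrightarrow B. \]

Then a + b is the endomorphism which carries each vector
$\epsilon_i$ into the vector $a\epsilon_i + b\epsilon_i$. But

\[ (a + b)E = aE + bE = AE + BE = (A + B)E \]

so that

\[ a + b \leftrightarrow A + B. \]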

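Consider the matrix whose row vectors are the vectors

\[ \phi_i = v_{i1}\epsilon_1 + v_{i2}\epsilon_2 + \cdots + v_{in}\epsilon_n. \]

Clearly

\[
\begin{bmatrix} \phi_1 \\ \phi_2 \\ \vdots \\ \phi_n \end{bmatrix}
=
\begin{bmatrix}
v_{11} & v_{12} & \cdots & v_{1n} \\
v_{21} & v_{22} & \cdots & v_{2n} \\
\vdots & \vdots &        & \vdots \\
v_{n1} & v_{n2} & \cdots & v_{nn}
\end{bmatrix}
\begin{bmatrix} \epsilon_1 \\ \epsilon_2 \\ \vdots \\ \epsilon_n \end{bmatrix}
= VE.
\]

From the definition of endomorphism we have

\[ a\phi_i = v_{i1}\,a\epsilon_1 + v_{i2}\,a\epsilon_2 + \cdots + v_{in}\,a\epsilon_n \]

so that

\[
aVE =
\begin{bmatrix} a\phi_1 \\ a\phi_2 \\ \vdots \\ a\phi_n \end{bmatrix}
=
\begin{bmatrix}
v_{11} & v_{12} & \cdots & v_{1n} \\
v_{21} & v_{22} & \cdots & v_{2n} \\
\vdots & \vdots &        & \vdots \\
v_{n1} & v_{n2} & \cdots & v_{nn}
\end{bmatrix}
\begin{bmatrix} a\epsilon_1 \\ a\epsilon_2 \\ \vdots \\ a\epsilon_n \end{bmatrix}
= V(aE).
\]

Now we may complete the proof of Theorem 89.
Evidently

\[ (ab)E = a(bE) = aBE = B(aE) = B(AE) = (BA)E. \]

Hence

\[ ab \leftrightarrow BA. \]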
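The reversal of order in the product can be checked numerically. This is a minimal sketch, assuming the standard basis and two hypothetical endomorphisms `a` and `b` of $S_2$; the helpers `matrix_of` and `matmul` are illustrative names. As in the proof, `ab` means that b is applied first and then a.

```python
def matrix_of(endo, n):
    """Row i holds the coordinates of endo(e_i) in the standard basis."""
    return [tuple(endo(tuple(int(i == j) for j in range(n)))) for i in range(n)]

def matmul(X, Y):
    """Ordinary matrix product XY."""
    return [tuple(sum(X[i][k] * Y[k][j] for k in range(len(Y)))
                  for j in range(len(Y[0])))
            for i in range(len(X))]

def a(v):
    # Hypothetical endomorphism: (x, y) -> (x + 2y, 3y).
    x, y = v
    return (x + 2 * y, 3 * y)

def b(v):
    # Hypothetical endomorphism: (x, y) -> (y, x + y).
    x, y = v
    return (y, x + y)

def ab(v):
    # The product ab: first b, then a, as in (ab)E = a(bE).
    return a(b(v))

A, B = matrix_of(a, 2), matrix_of(b, 2)
matrix_of_product = matrix_of(ab, 2)
agrees_with_BA = matrix_of_product == matmul(B, A)
agrees_with_AB = matrix_of_product == matmul(A, B)
print(agrees_with_BA, agrees_with_AB)
```

For this particular pair the matrix of ab coincides with BA and not with AB, which is all that the footnote on notation asserts.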


65. Change of basis. An endomorphism of a vector 
space, as we have defined it, is an absolute concept and 
does not involve the concept of basis. The matric rep- 
resentation of the endomorphism, however, appears to 
depend upon the basis chosen. When we pass from one 
basis to another we should therefore expect the total 
matric ring to undergo an automorphism — that is, an 
isomorphism with itself. We shall now see how this takes 
place. 

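If $\epsilon_1, \epsilon_2, \cdots, \epsilon_n$ and $\eta_1, \eta_2, \cdots, \eta_n$ are two bases of
$S_n$, they are related in the manner

\[ \eta_i = \sum_{j} t_{ij}\epsilon_j, \qquad T = (t_{rs}) \]

where the elements are in F and T is a non-singular
matrix. These equations may be written

\[
H = TE, \qquad H = \begin{bmatrix} \eta_1 \\ \eta_2 \\ \vdots \\ \eta_n \end{bmatrix}.
\]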

If a is an endomorphism, let

\[ aE = AE, \qquad aH = A'H \]

so that, relative to the $\epsilon$-basis $a \leftrightarrow A$ and relative to the
$\eta$-basis $a \leftrightarrow A'$. Then

\[ aH = aTE = TaE = TAE. \]

Also

\[ aH = A'H = A'TE. \]

Since the $\epsilon$'s are linearly independent, $A'T = TA$, or

\[ A' = TAT^{-1}. \]


We have proved 

Theorem 90. Under a change of basis of matrix T,
the matrix $A \leftrightarrow a$ is transformed into the similar matrix
$TAT^{-1}$.
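A small numerical check may help fix the conventions of Theorem 90. It is a sketch only: the endomorphism `a`, the matrix `T`, and the helper names are hypothetical, and the old basis is assumed to be the standard one, so that the rows of T are the new basic vectors $\eta_i$. The check confirms that $a\eta_i = \sum_j a'_{ij}\eta_j$ when $A' = TAT^{-1}$.

```python
from fractions import Fraction

def matmul(X, Y):
    """Ordinary matrix product XY."""
    return [tuple(sum(X[i][k] * Y[k][j] for k in range(len(Y)))
                  for j in range(len(Y[0])))
            for i in range(len(X))]

def inverse_2x2(T):
    """Inverse of a non-singular 2 by 2 matrix, computed exactly."""
    (p, q), (r, s) = T
    det = Fraction(p * s - q * r)
    if det == 0:
        raise ValueError("T must be non-singular")
    return [(s / det, -q / det), (-r / det, p / det)]

def a(v):
    # Hypothetical endomorphism: (x, y) -> (x + 2y, 3y).
    x, y = v
    return (x + 2 * y, 3 * y)

A = [(1, 0), (2, 3)]   # rows are a(e_1), a(e_2), so that aE = AE
T = [(1, 1), (0, 1)]   # rows are the new basic vectors eta_1, eta_2
A_prime = matmul(matmul(T, A), inverse_2x2(T))

images = [a(eta) for eta in T]    # a(eta_i), in the old coordinates
recombined = matmul(A_prime, T)   # sum_j a'_ij eta_j, in the old coordinates
print(images == recombined)
```

Here `recombined` is the product $A'T$ and `images` is the product $TA$, so the comparison is exactly the relation $A'T = TA$ used in the proof.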

The correspondence

\[ A \leftrightarrow TAT^{-1}, \quad B \leftrightarrow TBT^{-1}, \quad \cdots \]

is clearly an automorphism of the total matric ring. For
it is biunique, and

\[ TAT^{-1} + TBT^{-1} = T(A + B)T^{-1}, \]
\[ TAT^{-1}\,TBT^{-1} = T(AB)T^{-1} \]

so that

\[ A + B \leftrightarrow T(A + B)T^{-1}, \qquad AB \leftrightarrow T(AB)T^{-1}. \]

An automorphism obtained by multiplying all the
matrices of the ring on the left by the same non-singular
matrix T and on the right by $T^{-1}$ is called an inner
automorphism of the ring.
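The two identities which make the correspondence an automorphism can be verified on particular matrices. The sketch below assumes a hypothetical integral matrix `T` of determinant 1, so that its inverse `T_inv` may be written down by hand; `conj`, `matmul` and `matadd` are illustrative helper names.

```python
def matmul(X, Y):
    """Ordinary matrix product XY."""
    return [tuple(sum(X[i][k] * Y[k][j] for k in range(len(Y)))
                  for j in range(len(Y[0])))
            for i in range(len(X))]

def matadd(X, Y):
    """Elementwise matrix sum X + Y."""
    return [tuple(x + y for x, y in zip(rx, ry)) for rx, ry in zip(X, Y)]

T = [(2, 1), (1, 1)]        # hypothetical non-singular matrix, determinant 1
T_inv = [(1, -1), (-1, 2)]  # its inverse, written out by hand

def conj(X):
    """The inner automorphism X -> T X T^{-1}."""
    return matmul(matmul(T, X), T_inv)

A = [(1, 0), (2, 3)]
B = [(0, 1), (1, 1)]

sum_preserved = conj(matadd(A, B)) == matadd(conj(A), conj(B))
product_preserved = conj(matmul(A, B)) == matmul(conj(A), conj(B))
print(sum_preserved, product_preserved)
```

Such a check is of course no proof; it merely exhibits the cancellation of $T^{-1}T$ in the middle of the product.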

If $\epsilon_1, \epsilon_2, \cdots, \epsilon_n$ is any basis for $S_n$, and if T is any
non-singular matrix with elements in F, the vectors

\[ \eta_i = \sum_{j} t_{ij}\epsilon_j, \qquad T = (t_{rs}) \]

constitute a basis for $S_n$. Thus if a is an endomorphism of
matrix A, and if A' is any matrix similar to A, a basis for
$S_n$ may be chosen with respect to which $a \leftrightarrow A'$. Now
from Theorem 68 we have

Theorem 91. A basis for the vector space $S_n$ may be
so chosen that any particular endomorphism has a matrix
which is a direct sum of matrices of Type (35).

The abstract approach to matric theory by way of 
vector spaces has certain advantages of elegance over 
the classical approach. Thus there is no question as to 
how the sum, product and scalar product of matrices 
must be defined if the correspondence between endo- 
morphisms and matrices is to be an isomorphism. Once 
this isomorphism is established, the associative and 
distributive laws, etc., which are so obvious for endo- 
morphisms, are necessarily effective for matrices. The 
necessity and reasonableness of these rules of combination
are more directly apparent.

PROBLEMS


Chapter I 1-2 

1. Solve by putting the system into triangular form:
x + y + z = 8, x + y + w = 12, x + z + w = 14, y + z + w = 14.

2. Obtain the general solution of 

x + y - z = 3, x - 3y + 2z = 1, 2x - 2y + z = 4.

Chapter II 3-34 

3. Find a basis for the linear system spanned by 
(11, 8, -2, 3), (2, 3, -1, 2), (7, -1, 1, -3), 
(4, -11, 5, -12). 

4. Show that (13, 11, -3, 5) is a vector of the linear
system of Ex. 3. 

5. Let S be spanned by

\[ \phi_1 = (11, 2, 7, 4), \qquad \phi_2 = (8, 3, -1, -11), \]
\[ \phi_3 = (-2, -1, 1, 5), \qquad \phi_4 = (3, 2, -3, -12) \]

and define

\[ \psi_1 = (3, -1, 8, 15), \qquad \psi_2 = (21, 6, 7, -12). \]

Show that $\psi_1$ and $\psi_2$ are in S and are linearly independent,
and that $\psi_1, \psi_2, \phi_3, \phi_4$ span S.

6. Kronecker's delta $\delta_{ij}$ is a function such that
$\delta_{ij} = 1$ if $i = j$ and $\delta_{ij} = 0$ if $i \neq j$. Find a function f(i, j) which

(a) equals a when i = 2 and equals b when i ≠ 2;

(b) equals a when i = 2 and j = 3, equals b otherwise;

(c) equals a when i = 2 or j = 3, equals b otherwise.

7. Let $\epsilon_{ij}$ be the matrix $(\delta_{ri}\delta_{js})$. Prove that

\[ \epsilon_{ij}\epsilon_{kl} = \delta_{jk}\epsilon_{il}. \]


8. Prove that an n by n matrix A which is commutative
with every n by n matrix X is scalar.

Hint: Let $X = \epsilon_{ij}$.

9. Define $u = (\delta_{r+1,s})$ where u is of order n. Show
that $u^2 = (\delta_{r+2,s})$ and that $u^n = 0$.

10. If 

r 3 ■ 4 -, 

i* - -1 0 2 , 

L 2 3 J 

find $A^{-1}$ such that $A^{-1}A = I$.

11. Show that the elementary transformations of 
Type I can be obtained as successions of elementary 
transformations of Types II and III. 

12. Reduce to Hermite form the matrix

\[
\begin{bmatrix}
10 & 2 & -2 & 2 \\
7 & 2 & -1 & 1 \\
-4 & 4 & 4 & -4 \\
9 & 3 & -1 & 1
\end{bmatrix}.
\]


13. Find a sequence of elementary matrices $E_1, E_2, \cdots, E_k$
such that

\[ E_k \cdots E_2 E_1 A = I, \]



14. Consider the rectangular array 





pi 0 0 4 -1 

7/1= 0 1 0 1 3 

Lo 0 1 2 0 

Apply to this array the row operations of Ex. 13 and
note that the result is the array $A^{-1}\; I$. Prove that the method is
general.
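The method of Ex. 14 may be sketched in code as follows. This is an illustration under assumptions, not the solution of the exercise: the blocks are written in the order $[A \mid I]$ rather than the order used above, the test matrix is hypothetical, and the function name `invert_by_row_ops` is ours. Only the three types of elementary row operations are applied, and exact rational arithmetic is used throughout.

```python
from fractions import Fraction

def invert_by_row_ops(A):
    """Reduce the array [A | I] by row operations to [I | A^{-1}]."""
    n = len(A)
    # Build the rectangular array [A | I].
    M = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
         for i, row in enumerate(A)]
    for col in range(n):
        # Look for a row at or below the diagonal with a non-zero entry here.
        pivot = next((r for r in range(col, n) if M[r][col] != 0), None)
        if pivot is None:
            raise ValueError("matrix is singular")
        M[col], M[pivot] = M[pivot], M[col]      # interchange two rows
        p = M[col][col]
        M[col] = [x / p for x in M[col]]         # multiply a row by a scalar
        for r in range(n):
            if r != col and M[r][col] != 0:
                f = M[r][col]
                # add a multiple of one row to another
                M[r] = [x - f * y for x, y in zip(M[r], M[col])]
    return [row[n:] for row in M]

A = [[2, 1], [5, 3]]                 # hypothetical non-singular matrix
A_inv = invert_by_row_ops(A)
print(A_inv)
```

Each row operation is a left multiplication by an elementary matrix, so the product of the operations which carries A to I carries I to $A^{-1}$; this is the point of the exercise.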

15. Find the most general matrix P such that PH = H,

\[
H = \begin{bmatrix}
0 & 0 & 0 & 0 \\
2 & 1 & 0 & 0 \\
0 & 0 & 0 & 0 \\
3 & 0 & -1 & 1
\end{bmatrix}.
\]

16. Outline a method for determining the most general
non-singular matrix P such that PA = A, where A
is any matrix.

17. Let f(x, y) be the bilinear form

\[ f(x, y) = \sum_{i,j=1}^{n} a_{ij} x_i y_j \]

whose matrix is $A = (a_{rs})$. Let f(x, y) become

\[ f(x', y') = \sum_{i,j=1}^{n} a'_{ij} x'_i y'_j \]

of matrix $A' = (a'_{rs})$ under the transformations

\[ x_i = \sum_{k=1}^{n} p_{ik} x'_k, \qquad y_j = \sum_{l=1}^{n} q_{jl} y'_l \qquad (i, j = 1, \cdots, n) \]

of respective matrices $P = (p_{rs})$, $Q = (q_{rs})$. Show that
$A' = P^T A Q$.



18. Show that every bilinear form f(x, y) can, by a
proper choice of non-singular linear transformations, be
brought to the form

\[ x_1' y_1' + x_2' y_2' + \cdots + x_r' y_r', \qquad r \leq n. \]

19. Illustrate the theorem of Ex. 18 by using the
matrix of Ex. 12.

20. If A is of rank r, and there exists a matrix
S of rank s such that SA = A.

21. If PA = B, prove that the row space of B is a
subspace of the row space of A.

22. If the row space of B is a subspace of the row
space of A, show that there exists a matrix P such that
PA = B.

23. If PA = B and QB = A, prove that there exist
non-singular matrices $P_1$ and $Q_1$ such that $P_1A = B$ and
$Q_1B = A$.

24. If PAQ = B and RBS = A, show that there exist
non-singular matrices $P_1, Q_1, R_1, S_1$ such that $P_1AQ_1 = B$,
$R_1BS_1 = A$.

25. Show that the sum and product of triangular 
matrices are triangular, the diagonal elements of the 
sum (product) being the sum (product) of the corre- 
sponding diagonal elements of the summands (factors). 

26. Show that by a modification of the reduction,
the Hermite form of a matrix under column transformations
can be taken in triangular form, the 0's above the
diagonal.

27. Prove Sylvester’s Law of Nullity: The nullity 
of a product of two matrices is greater than or equal to 
the nullity of either factor, and less than or equal to the 
sum of their nullities. 

28. Prove that every matrix is uniquely expressible 
as the sum of a symmetric matrix and a skew matrix. 

29. If A is a skew matrix of odd order, show that 
d(A)=0. 

30. Prove that the rank of every skew matrix is 
even. 

31. If A is a skew matrix of even order, d(A) is
expressible as the square of a rational function of the
elements of A. Prove for n = 4.

32. Determine the rank of

\[
\begin{bmatrix}
0 & 1 & b & d \\
1 & 0 & c & e \\
b & c & 2bc & cd + be \\
d & e & cd + be & 2de
\end{bmatrix}.
\]


33. If a matrix has a unique left inverse, show that 
this inverse is also a right inverse. 

Hint: Consider $X^* = XX^{-1} - I + X^{-1}$.

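34. Given

\[
A = \begin{bmatrix} 3 & 2 & -1 \\ -2 & 5 & 3 \\ 1 & 7 & 2 \end{bmatrix},
\qquad
B = \begin{bmatrix} -3 & -2 & 1 \\ 7 & 11 & 0 \\ 4 & 9 & 1 \end{bmatrix}.
\]

Find a non-singular matrix P such that A = PB. Find a
matrix $P_1$ of rank 2 such that $A = P_1B$.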


Chapter IV 35-52 

35. Given the matric polynomials 


/(*) = 


+ 


*(*) = 


c a 

C 3 

* 4 + 

> 

GJ 

+ 

c a 

G:a 

** + 

g a 

* — 

n 281 

L6 26 J 


t 


(a) Divide f(x) by g(x) by long division and find 
q(x) and r(x). 

(b) Show that $g_L(c) = 0$ when

(c) Find $f_L(c)$ and $r_L(c)$.

36. Let f(x) be given as in Ex. 35, and let 


— ctm- 

(a) Find $q(x)$, $r$, $q_1(x)$ and $r_1$.

(b) Show that $f_L(c) = r$ and $f_R(c) = r_1$ where


37. Given 



Find the characteristic equation of A and show that this 
equation is satisfied by A. 

38. Determine the minimum function of A of Ex. 37. 

39. Let $F(x) = x^2 - 9$. If d(x) is a greatest common
divisor of F(x) and m(x) as determined in Ex. 38, show 
that the rank of F(A) is equal to the rank of d(A). 

40. Find the determinant and the norm of A of Ex. 
37. 

41. Express $A^{-1}$ as a polynomial in A. (Ex. 37).

42. If $F(x) = x^2 + x - 2$, express the inverse of F(A)
as a polynomial in A.

43. By means of the Euclid algorithm, find a greatest
common divisor d(x) of

\[ f(x) = 3x^4 + x^3 + 5x^2 + 2x + 4, \]
\[ g(x) = 2x^4 + x^3 - x^2 - 3x - 2. \]

Find two polynomials p(x) and q(x) such that

\[ d(x) = p(x)f(x) + q(x)g(x). \]
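The Euclid algorithm for polynomials over the rational field may be sketched as below. The sketch is illustrative only: it is run on a hypothetical pair of quadratics rather than on the polynomials of Ex. 43, the names `poly_divmod` and `poly_gcd` are ours, a polynomial is assumed to be given as a list of coefficients with the highest power first and a non-zero leading coefficient, and only d(x) is computed, not the multipliers p(x) and q(x).

```python
from fractions import Fraction

def poly_divmod(f, g):
    """Divide f by g (coefficient lists, highest power first); return (q, r)."""
    f = [Fraction(c) for c in f]
    g = [Fraction(c) for c in g]
    q = []
    while len(f) >= len(g):
        c = f[0] / g[0]
        q.append(c)
        # Subtract c * g from the leading terms of f; the first term cancels.
        for i in range(len(g)):
            f[i] -= c * g[i]
        f.pop(0)
    # Drop leading zeros of the remainder; the zero polynomial is [].
    while f and f[0] == 0:
        f.pop(0)
    return q, f

def poly_gcd(f, g):
    """Euclid algorithm: repeat (f, g) -> (g, remainder) until g is zero."""
    f = [Fraction(c) for c in f]
    g = [Fraction(c) for c in g]
    while g:
        _, r = poly_divmod(f, g)
        f, g = g, r
    return [c / f[0] for c in f]   # normalized to leading coefficient 1

# Hypothetical example: x^2 - 1 and x^2 - 3x + 2 have the common factor x - 1.
d = poly_gcd([1, 0, -1], [1, -3, 2])
print(d)
```

Keeping the successive quotients would allow p(x) and q(x) to be recovered by back substitution, which is the second part of the exercise.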

44. Show that if f(x) and g(x) are relatively prime
polynomials, there exist polynomials p(x) and q(x) such
that

\[ p(x)f(x) + q(x)g(x) = 1, \]

and conversely if such polynomials exist, f(x) and g(x)
are relatively prime.

45. Show that if f(x) and g(x) are relatively prime,
and if f(x) divides $g(x) \cdot h(x)$, then f(x) divides h(x).

46. Show that if f(x) and g(x) are relatively prime
polynomials, f(x) of degree m and g(x) of degree n,
there exists a unique pair of polynomials, p(x) of degree
< n and q(x) of degree < m, such that

\[ p(x)f(x) + q(x)g(x) = 1. \]

47. Prove a theorem analogous to Ex. 46 in the case
where f(x) and g(x) are not relatively prime.

48. If the field F * contains the field F , and if d= (a, b) 
in F[x], then d=(a,b) in F*[x]. 

49. Let A have the minimum function m(x), and
let B = f(A) have the minimum function M(x). If a is
a root of m(x) = 0, show that f(a) is a root of M(x) = 0.

Hint: $m(x) \mid M(f(x))$.

50. Let b be a root of M(x) = 0 in Ex. 49. Show that
there exists a root a of m(x) = 0 such that b = f(a).

Hint: $f(A) - bI$ is singular.

51. Find an equation whose roots are the squares of
the roots of $x^2 + x - 3 = 0$.

52. Find an equation each of whose roots is a value
of the function $f(u) = u^2 + u + 1$ where u ranges over the
roots of the equation $x^3 + x + 1 = 0$.





Chapter V 53-71 

53. If S has the basic vectors (1, 0, 0, 0, 0), (3, 2, 1,
0, 0) and (2, -3, 1, 5, 1), find a basis for the orthogonal
complement S'.

54. Find the orthogonal complement of

\[
\begin{bmatrix}
4 & 8 & 18 & 7 \\
0 & 4 & 10 & 1 \\
10 & 18 & 40 & 17 \\
1 & 7 & 17 & 3
\end{bmatrix}.
\]

55. Let F be the complex field and let $S_1$ be spanned
by the vector (i, 1, 0). Find $S_1'$ and show that $S_1 \subset S_1'$.

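56. Find a greatest common right divisor of

\[
\begin{bmatrix}
2 & 1 & 4 & 2 \\
-1 & 1 & 5 & 3 \\
3 & 3 & 13 & 7 \\
1 & 2 & 9 & 5
\end{bmatrix},
\qquad
\begin{bmatrix}
-1 & 3 & 4 & 0 \\
0 & -2 & 1 & 3 \\
2 & 4 & 18 & 10 \\
0 & 0 & 0 & 0
\end{bmatrix}.
\]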


57. Prove that if F is formally real and if $S_1$ is any
vector space with elements in F, then the union of $S_1$
and $S_1'$ spans the entire space.

58. Let $S_1$ have the basic vectors (0, 1, 3, 2),
(0, 0, 1, 5), and let $S_2$ have the basic vectors (1, 2, 0, 3),
(0, 2, 6, 4). Use these to illustrate Theorem 51.

59. Find a g.c.r.d. D of matrices A and B, and find
also matrices P and Q such that D = PA + QB, where

\[
A = \begin{bmatrix}
0 & 0 & 0 & 0 \\
0 & 1 & 3 & 2 \\
0 & 0 & 1 & 5 \\
0 & 0 & 0 & 0
\end{bmatrix},
\qquad
B = \begin{bmatrix}
1 & 2 & 0 & 3 \\
0 & 2 & 6 & 4 \\
0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0
\end{bmatrix}.
\]

60. Illustrate Theorem 53 with the example

\[ f_1(x) = x^3 - 7x^2 + 16x - 12, \qquad f_2(x) = x^3 - 4x^2 + x + 6, \]



61. Let $m_1(x) \cdot m_2(x)$ be a divisor of the minimum
function m(x) of A, $m_1$ and $m_2$ not constants. Prove that
the rank of $m_1(A)\,m_2(A)$ is less than the rank of $m_1(A)$.

62. Without using any theorem which depends upon 
determinants, prove that the degree of m(x) is less than 
or equal to n when A has complex elements. 

63. If A has rational elements and m(x) is the poly- 
nomial of lowest degree with rational coefficients such 
that m(A)=0, prove that m(x) is the polynomial of 
lowest degree with complex coefficients such that m(A) 
= 0 . 

64. If A has elements in a field F, and if $F^* \supset F$,
show that the row rank of A with respect to F is equal 
to the row rank of A with respect to F*. 

65. Show that $(x - 1)^3$ is the minimum function of



and find a vector in the null space of $[l(A)]^2$ which is
not in the null space of $l(A)$, where $l(x)$ denotes $x - 1$.

66. Let k(x) be a polynomial of lowest degree such
that $k(A) \cdot \phi = 0$. Prove that if $l(A) \cdot \phi = 0$, then k(x)
divides l(x).

67. Let h(x) be any polynomial. Prove that $h(A)\phi$
is annihilated by the k(A) of Ex. 66.

68. Prove that $\phi, A\phi, A^2\phi, \cdots, A^{k-1}\phi$ are k linearly
independent vectors annihilated by k(A) where k is the
degree of k(x) of Ex. 66.

69. Let $\phi$ be a vector which is not annihilated by any
matrix $m_0(A)$ where $m_0(x)$ is a proper divisor of the
minimum function m(x) of A. Prove that if A is of
order n, m(x) is of degree $\leq n$. (Cf. Ex. 62.)

70. If A has the minimum function $m(x) = m_1(x) \cdot m_2(x)$
where $m_1(x)$ and $m_2(x)$ are relatively prime, then
if $m_1(A)X = 0$, show that there exists a matrix B such
that $X = m_2(A) \cdot B$.

71. Reduce to the rational canonical form

\[
\begin{bmatrix}
-2 & 8 & 10 & 8 \\
-1 & 5 & 5 & 4 \\
3 & -7 & -7 & -7 \\
-2 & 5 & 7 & 6
\end{bmatrix}.
\]


Chapter VII 72-77 

72. The matrix

\[
\begin{bmatrix}
12 & 14 & 0 \\
4 & 6 & -2 \\
-10 & -8 & 10
\end{bmatrix}
\]

has elements in the ring of rational integers. Reduce it
to the canonical form of Theorem 70.

73. Reduce to normal form by means of elementary
transformations:

\[
\begin{bmatrix}
x & 1 & 0 & 0 & 0 \\
0 & x & 0 & 0 & 0 \\
0 & 0 & x & 0 & 0 \\
0 & 0 & 0 & x - 1 & 0 \\
0 & 0 & 0 & 0 & x - 1
\end{bmatrix}.
\]

74. Find the elementary divisors and invariant factors
of


'**(* - 1)* 0 0 

0 *(* - 1)* 0 
0 0 *- 1 

0 0 0 


75. Determine the invariant factors of 


- 2x 

3 

0 

1 

4x 

3(x+2) 

0 

x+2 

0 

6x 

X 

2x 

x - 1 

0 

x - 1 

0 

-3(* - 1) 

1 — X 

2(x - 1) 

0 


o- 

0 

0 

x_ 


x~ 

2x 

0 

0 

0. 


76. The elementary divisors of the characteristic
matrix of a matrix A are

2-x+x\ (2 — X+**)*, 5+3**+x», (5+3x ! +x*)*. 

Determine the rational canonical form of A . 

77. Find the rational canonical form of

\[
\begin{bmatrix}
-5 & 3 & -3 & 6 \\
6 & 0 & 4 & -5 \\
-2 & -1 & -1 & 1 \\
-10 & 3 & -6 & 10
\end{bmatrix}.
\]


Chapter VIII 78 

78. The vectors 

(1,3,2), (-1,2, -1), (1,4,2) 

span the space $S_3$ of dimension 3. Find an orthogonal
basis for $S_3$.








